Description



[...] thereof, peptides and [...] nucleotides and sequences thereof, such as in fermentation, polypeptide
production, assays and pharmaceutical development, among others.

The genus Staphylococcus includes at least [...] species (for a review see Novick, R. P., The Staphylococcus
as a Molecular Genetic System, in MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI, R. Novick, Ed., VCH [...]).



Human Health and S. Aureus 
[...] to the nonviable debris on the burn surface [...] and invade viable tissue below the eschar,
and it may reach below the skin, enter the [...] and develop septicaemia. [...] among the most important
pathogens [...] wound infections. It can destroy granulation tissue and produce severe septicaemia.



Cellulitis 



[...] layer [...]. Cellulitis can lead to systemic infection. In fact, cellulitis can be one aspect of [...].
This condition is caused by a mixture of S. aureus and microaerophilic [...]. [...] necrosis, and treatment
is limited to excision of the necrotic tissue. The condition often is fatal.

Joint infections

S. aureus infects bone joints, causing diseases such as osteomyelitis.

Osteomyelitis 
children and adolescents more than adults and it is associated with non-penetrating injuries to bones. Infection typically
occurs in the long end of growing bone, hence its occurrence in physically immature populations. Most often, infection 
is localized in the vicinity of sprouting capillary loops adjacent to epiphysial growth plates in the end of long, growing 
bones. 

Skin infections 

S. aureus is the most common pathogen of such minor skin infections as abscesses and boils. Such infections 
often are resolved by normal host response mechanisms, but they also can develop into severe internal infections. 
Recurrent infections of the nasal passages plague nasal carriers of S. aureus.

Surgical Wound Infections 

Surgical wounds often penetrate far into the body. Infection of such wounds thus poses a grave risk to the patient.
S. aureus is the most important causative agent of infections in surgical wounds. S. aureus is unusually adept at
invading surgical wounds; sutured wounds can be infected by far fewer S. aureus cells than are necessary to cause
infection in normal skin. Invasion of a surgical wound can lead to severe S. aureus septicaemia. Invasion of the blood
stream by S. aureus can lead to seeding and infection of internal organs, particularly heart valves and bone, causing
systemic diseases, such as endocarditis and osteomyelitis.

Scalded Skin Syndrome 

S. aureus is responsible for "scalded skin syndrome" (also called toxic epidermal necrosis, Ritter's disease and
Lyell's disease). This disease occurs in older children, typically in outbreaks caused by flowering of S. aureus strains
that produce exfoliatin (also called scalded skin syndrome toxin). Although the bacteria initially may infect only a minor
lesion, the toxin destroys intercellular connections, spreads epidermal layers and allows the infection to penetrate the
outer layer of the skin, producing the desquamation that typifies the disease. Shedding of the outer layer of skin
generally reveals normal skin below, but fluid lost in the process can produce severe injury in young children if it is not
treated properly.

Toxic Shock Syndrome 

Toxic shock syndrome is caused by strains of S. aureus that produce the so-called toxic shock syndrome toxin.
The disease can be caused by S. aureus infection at any site, but it is too often erroneously viewed exclusively as a
disease of women who use tampons. The disease involves toxaemia and septicaemia, and can be fatal.

Nosocomial Infections

In the 1984 National Nosocomial Infection Surveillance Study ("NNIS") S. aureus was the most prevalent agent
of surgical wound infections in many hospital services, including medicine, surgery, obstetrics, pediatrics and newborns.

Resistance to drugs of S. aureus strains 

Prior to the introduction of penicillin the prognosis for patients seriously infected with S. aureus was unfavorable.
Following the introduction of penicillin in the early 1940s even the worst S. aureus infections generally could be treated
successfully. The emergence of penicillin-resistant strains of S. aureus did not take long, however. Most strains of S.
aureus encountered in hospital infections today do not respond to penicillin, although, fortunately, this is not the case
for S. aureus encountered in community infections.

It is well known now that penicillin-resistant strains of S. aureus produce a lactamase which converts penicillin to
penicilloic acid, and thereby destroys antibiotic activity. Furthermore, the lactamase gene often is propagated
episomally, typically on a plasmid, and often is only one of several genes on an episomal element that, together, confer
multidrug resistance.

Methicillins, introduced in the 1960s, largely overcame the problem of penicillin resistance in S. aureus. These
compounds conserve the portions of penicillin responsible for antibiotic activity and modify or alter other portions that
make penicillin a good substrate for inactivating lactamases. However, methicillin resistance has emerged in S. aureus,
along with resistance to many other antibiotics effective against this organism, including aminoglycosides, tetracycline,
chloramphenicol, macrolides and lincosamides. In fact, methicillin-resistant strains of S. aureus generally are multiply
drug resistant.

The molecular genetics of most types of drug resistance in S. aureus has been elucidated (see Lyon et al.,
Microbiology Reviews 51: 88-134 (1987)). Generally, resistance is mediated by plasmids, as noted above regarding penicillin
resistance; however, several stable forms of drug resistance have been observed that apparently involve integration
of a resistance element into the S. aureus genome itself.

Thus far, each new antibiotic gives rise to resistant strains, strains emerge that are resistant to multiple drugs,
and increasingly persistent forms of resistance begin to emerge. Drug resistance of S. aureus infections already poses
significant treatment difficulties, which are likely to get much worse unless new therapeutic agents are developed.

Molecular Genetics of Staphylococcus Aureus 

Despite its importance in, among other things, human disease, relatively little is known about the genome of this
organism.

Most genetic studies of S. aureus have been carried out using the strain NCTC 8325, which contains prophages
φ11, φ12 and φ13, and the UV-cured derivative of this strain, 8325-4 (also referred to as RN450), which is free of
the prophages.

These studies revealed that the S. aureus genome, like that of other staphylococci, consists of one circular,
covalently closed, double-stranded DNA and a collection of so-called variable accessory genetic elements, such as
prophages, plasmids, transposons and the like.

Physical characterization of the genome has not been carried out in any detail. Pattee et al. published a low
resolution and incomplete genetic and physical map of the chromosome of S. aureus strain NCTC 8325 (Pattee et al.,
Genetic and Physical Mapping of Chromosome of Staphylococcus aureus NCTC 8325, Chapter 11, pgs. 163-169 in
MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI, R.P. Novick, Ed., VCH Publishers, New York (1990)). The genetic
map largely was produced by mapping insertions of Tn551 and Tn4001, which, respectively, confer erythromycin and
gentamicin resistance, and by analysis of SmaI-digested DNA by Pulsed Field Gel Electrophoresis ("PFGE").

The map was of low resolution; even estimating the physical size of the genome was difficult, according to the
investigators. The size of the largest SmaI chromosome fragment, for instance, was too large for accurate sizing by
PFGE. To estimate its size, additional restriction sites had to be introduced into the chromosome using a transposon
containing a SmaI recognition sequence.

In sum, most physical characteristics and almost all of the genes of Staphylococcus aureus are unknown. Among
the few genes that have been identified, most have not been physically mapped or characterized in detail. Only a very
few genes of this organism have been sequenced. (See, for instance, Thornsberry, J. Antimicrobial Chemotherapy 21
Suppl. C: 9-16 (1988), current versions of GENBANK and other nucleic acid databases, and references that relate to
the genome of S. aureus such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S. aureus infection involves the programmed
expression of S. aureus genes, and that characterizing the genes and their patterns of expression would add
dramatically to our understanding of the organism and its host interactions. Knowledge of S. aureus genes and genomic
organization would dramatically improve understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of
S. aureus would provide reagents for, among other things, detecting, characterizing and controlling S. aureus infections.
There is a need, therefore, to characterize the genome of S. aureus and for polynucleotides and sequences of this
organism.

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome. The primary 
nucleotide sequences which were generated are provided in SEQ ID NOS: 1-5,191. 

The present invention provides the nucleotide sequence of several thousand contigs of the Staphylococcus aureus
genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one
embodiment, the present invention is provided as contiguous strings of primary sequence information corresponding to the
nucleotide sequences depicted in SEQ ID NOS:1-5,191.

The present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most
preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

The nucleotide sequence of SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence
which is at least 95%, preferably 99% and most preferably 99.9%, identical to the nucleotide sequence of SEQ ID
NOS:1-5,191 may be provided in a variety of mediums to facilitate its use. In one application of this embodiment, the
sequences of the present invention are recorded on computer readable media. Such media include, but are not limited
to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media
such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media.

The present invention further provides systems, particularly computer-based systems which contain the sequence
information herein described stored in a data storage means. Such systems are designed to identify commercially
important fragments of the Staphylococcus aureus genome.

Another embodiment of the present invention is directed to fragments, preferably isolated fragments, of the
Staphylococcus aureus genome having particular structural or functional attributes. Such fragments of the Staphylococcus
aureus genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter
referred to as "open reading frames" or "ORFs," fragments which modulate the expression of an operably linked ORF,
hereinafter referred to as "expression modulating fragments" or "EMFs," and fragments which can be used to diagnose
the presence of Staphylococcus aureus in a sample, hereinafter referred to as "diagnostic fragments" or "DFs."

Each of the ORFs in fragments of the Staphylococcus aureus genome disclosed in Tables 1-3, and the EMFs
found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in
a sample, to selectively control gene expression in a host, and in the production of polypeptides, such as polypeptides
encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genome of the present invention. The recombinant constructs of the present invention comprise
vectors, such as a plasmid or viral vector, into which a fragment of the Staphylococcus aureus genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments of the Staphylococcus
aureus genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.

The present invention is further directed to polypeptides and proteins, preferably isolated polypeptides and
proteins, encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely
may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using
commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and
proteins of the present invention from cells which have been altered to express them.

The invention further provides polypeptides, preferably isolated polypeptides, comprising Staphylococcus aureus
epitopes and vaccine compositions comprising such polypeptides. Also provided are methods for vaccinating an
individual against Staphylococcus aureus infection.

The invention further provides methods of obtaining homologs of the fragments of the Staphylococcus aureus
genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques
such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention.
Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an
immortalized cell line which is capable of secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples derived from cells which express one
of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one
or more of the antibodies of the present invention, or one or more of the DFs or antigens of the present invention, under
conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry
out the above-described assays.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the antibodies, antigens, or one of the DFs of the present
invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents
capable of detecting presence of bound antibodies, antigens or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present
invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates,
pharmaceutical agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded
by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Staphylococcus aureus will be of great value to all laboratories working with
this organism and for a variety of commercial purposes. Many fragments of the Staphylococcus aureus genome will
be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value
to Staphylococcus aureus researchers and for immediate commercial value for the production of proteins or to control [...]

The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes [...]
libraries and for sequencing are provided below, for instance. A wide variety of Staphylococcus aureus strains that can
be used to prepare S. aureus genomic DNA for cloning and for obtaining polynucleotides of the present invention are
available to the public from recognized depository institutions, such as the American Type Culture Collection (ATCC).

The nucleotide sequences of the genomes from different strains of Staphylococcus aureus differ somewhat.
However, the nucleotide sequences of the genomes of all Staphylococcus aureus strains will be at least 95% identical, in
corresponding part, to the nucleotide sequences provided in SEQ ID NOS:1-5,191. Nearly all will be at least 99%
identical and the great majority will be 99.9% identical.

Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and
most preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS:1-5,191, in a form which can be readily
used, analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical
to the nucleotide sequences of SEQ ID NOS:1-5,191 are routine and readily available to the skilled artisan. For example,
the well known FASTA algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988) can be
used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate
an identity score of polynucleotides compared to one another.
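
By way of illustration only, the percent identity thresholds discussed above can be sketched in Python for two sequences that have already been aligned to the same length. This is a naive column-by-column comparison and is not the FASTA or BLASTN algorithm; the function names `percent_identity` and `meets_threshold` are hypothetical and chosen only for this sketch.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length nucleotide strings.

    Assumption: the alignment was produced elsewhere. Gap characters ('-')
    always count as mismatches. This is not FASTA or BLASTN.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 95.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(a, b) >= threshold
```

For instance, `meets_threshold(query, reference, 99.0)` would test the "preferably 99%" level for a pair of aligned strings, where `query` and `reference` are placeholders supplied by the user.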

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide
sequence of SEQ ID NOS:1-5,191 may be "provided" in a variety of mediums to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence
of the present invention; i.e., a nucleotide sequence provided in SEQ ID NOS:1-5,191, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical
to a polynucleotide of SEQ ID NOS:1-5,191. Such a manufacture provides a large portion of the Staphylococcus aureus
genome and parts thereof (e.g., a Staphylococcus aureus open reading frame (ORF)) in a form which allows a skilled
artisan to examine the manufacture using means not directly applicable to examining the Staphylococcus aureus
genome or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer
readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed
directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard
disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise,
it will be clear to those of skill how additional computer readable media that may be developed also can be used to
create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled
artisan can readily adopt any of the presently known methods for recording information on computer readable medium
to generate manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium
having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will
generally be based on the means chosen to access the stored information. In addition, a variety of data processor
programs and formats can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file, formatted in
commercially available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored
in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having
recorded thereon the nucleotide sequence information of the present invention.
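
By way of illustration only, the two storage formats mentioned above (an ASCII text representation and a database table) can be sketched in Python. The sketch assumes a FASTA-style text layout and substitutes the SQLite engine from the Python standard library for DB2, Sybase or Oracle; the table name `contig`, the record names and all function names are hypothetical.

```python
import sqlite3


def to_fasta(records: dict[str, str], width: int = 60) -> str:
    """Render named sequences as FASTA-style ASCII text, `width` bases per line."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-style text back into a name -> sequence mapping."""
    chunks: dict[str, list[str]] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def store_in_database(records: dict[str, str], conn: sqlite3.Connection) -> None:
    """Record the sequences in a single table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contig ("
        "name TEXT PRIMARY KEY, sequence TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO contig (name, sequence) VALUES (?, ?)",
        list(records.items()),
    )
    conn.commit()


def fetch_sequence(conn: sqlite3.Connection, name: str):
    """Return the stored sequence for `name`, or None if it is absent."""
    row = conn.execute(
        "SELECT sequence FROM contig WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

Writing the string returned by `to_fasta` to a file yields the ASCII file form, while `store_in_database` yields the database form; as the text notes, the choice between them generally depends on how the stored information is to be accessed.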

Computer software is publicly available which allows a skilled artisan to access sequence information provided in
a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID
NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and
most preferably at least 99.9% identical to a sequence of SEQ ID NOS:1-5,191, the present invention enables the
skilled artisan routinely to access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215: 403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17: 203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Staphylococcus aureus genome which contain
homology to ORFs or proteins from both Staphylococcus aureus and from other organisms. Among the ORFs discussed

accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) reside in main memory
108, in accordance with the requirements and operating parameters of the operating system, the hardware system
and the software program or programs.

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to fragments of the Staphylococcus aureus genome,
preferably to isolated fragments. The fragments of the Staphylococcus aureus genome of the present invention include,
but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which
modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs), and
fragments which can be used to diagnose the presence of Staphylococcus aureus in a sample, hereinafter diagnostic
fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Staphylococcus aureus genome"
refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification
means to reduce, from the composition, the number of compounds which are normally associated with the composition.
Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS:1-5,191, to
representative fragments thereof as described above, and to polynucleotides at least 95%, preferably at least 99% and
especially preferably at least 99.9% identical in sequence thereto, also as set out above.

A variety of purification means can be used to generate the isolated fragments of the present invention. These
include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Staphylococcus aureus DNA can be mechanically sheared to produce fragments of 15-20 kb
in length. These fragments can then be used to generate a Staphylococcus aureus library by inserting them into
lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those
in Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS:1-5,191. Well
known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library of
Staphylococcus aureus genomic DNA. Thus, given the availability of SEQ ID NOS:1-5,191, Tables
1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS:1-5,191
using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or
other nucleic acid fragment of the present invention.

The isolated nucleic acid molecules of the present invention include, but are not limited to, single stranded and
double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any
termination codons and is a sequence translatable into protein.

Tables 1, 2 and 3 list ORFs in the Staphylococcus aureus genomic contigs of the present invention that were
identified as putative coding regions by the GeneMark software using organism-specific second-order Markov
probability transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical
methods, such as those discussed herein, to generate more inclusive, more restrictive or more selective lists.

Table 1 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are at least 80 amino
acids long and that are, over a continuous region of at least 50 bases, 95% or more identical (by BLAST analysis) to
an S. aureus nucleotide sequence available through GenBank in November 1996.

Table 2 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are not in Table 1 and
match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through GenBank by
September 1996.

Table 3 sets out ORFs in the Staphylococcus aureus contigs of the present invention that do not match significantly,
by BLASTP analysis, a polypeptide sequence available through GenBank by September 1996.

In each table, the first and second columns identify the ORF by, respectively, contig number and ORF number
within the contig; the third column indicates the reading frame, taking the first 5' nucleotide of the contig as the start of
the +1 frame; the fourth column indicates the first nucleotide of the ORF, counting from the 5' end of the contig strand;
and the fifth column indicates the length of each ORF in nucleotides.
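
By way of illustration only, the first five table columns described above can be sketched in Python as a small record type. The sketch assumes that a forward-strand frame follows directly from the ORF start position (position 1 begins the +1 frame); how complementary-strand frames are numbered is not stated here and is not modeled. The names `OrfRow`, `forward_frame` and `make_row` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrfRow:
    contig: int      # column 1: contig number
    orf: int         # column 2: ORF number within the contig
    frame: int       # column 3: reading frame
    start: int       # column 4: first nucleotide, counted from the 5' end
    length_nt: int   # column 5: ORF length in nucleotides


def forward_frame(start: int) -> int:
    """Frame (1, 2 or 3) of a forward-strand ORF starting at 1-based `start`."""
    if start < 1:
        raise ValueError("start is 1-based and must be positive")
    return (start - 1) % 3 + 1


def make_row(contig: int, orf: int, start: int, length_nt: int) -> OrfRow:
    """Build a table row, deriving the frame from the start position."""
    return OrfRow(contig, orf, forward_frame(start), start, length_nt)
```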

In Tables 1 and 2, column six lists the "Reference" for the closest matching sequence available through GenBank.
These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be
familiar with their denominators. Descriptions of the nomenclature are available from the National Center for
Biotechnology Information. Column seven in Tables 1 and 2 provides the "gene name" of the matching sequence; column eight
provides the "BLAST identity" score from the comparison of the ORF and the homologous gene; and column nine
indicates the length in nucleotides of the "highest scoring segment pair" identified by the BLAST identity analysis.
In Table 3, the last column, column six, indicates the length of each ORF in amino acid residues.

The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art.
For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions 

phylococcus aureus, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression
through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a
polynucleotide sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA,
while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides.
Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary
to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition.
Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known
and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:
3073 (1979); Cooney et al., Science 241: 456 (1988); and Dervan et al., Science 251: 1360 (1991). Antisense
techniques in general are discussed in, for instance, Okano, J. Neurochem. 56: 560 (1991) and
OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE EXPRESSION, CRC Press, Boca Raton, FL (1988).
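
By way of illustration only, the complementarity requirement for an antisense oligonucleotide can be sketched in Python as a reverse complement of a 20 to 40 base target region of the mRNA. The sketch covers base pairing and the stated length range only; it makes no claim about target selection, specificity or efficacy, and the function name `antisense_dna` is hypothetical.

```python
# Watson-Crick complement as DNA; U in an RNA target pairs with A.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(target_region: str) -> str:
    """Return a DNA oligonucleotide, written 5' to 3', complementary to the target.

    `target_region` is a 20 to 40 base stretch of the mRNA written 5' to 3',
    using either U or T.
    """
    region = target_region.upper()
    if not 20 <= len(region) <= 40:
        raise ValueError("target region must be 20 to 40 bases long")
    unexpected = set(region) - set("ACGTU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return region.translate(_COMPLEMENT)[::-1]
```

For instance, `antisense_dna("AUGGCUAGCUAGGAUCCGAU")` returns `"ATCGGATCCTAGCTAGCCAT"`.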

The present invention further provides recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of
the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Staphylococcus
aureus genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the
ORFs of the present invention, the vector may further comprise regulatory sequences, including, for example, a
promoter, operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further
comprise a marker sequence or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially
available for generating the recombinant constructs of the present invention. The following vectors are provided by
way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK and KS (+ and -), pNH8a,
pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available
from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene);
pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).

Promoter regions can be selected from any desired gene using CAT (chloramphenicol transferase) vectors or other
vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the isolated fragments of the Staphylococcus
aureus genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the
host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present
invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such
as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in,
for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Staphylococcus aureus genomic fragments and contigs of the
present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment
(in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.

The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present
invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is
intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by
nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence.
Preferred nucleic acid fragments of the present invention are the ORFs depicted in Tables 2 and 3 which encode
proteins.

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins
of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially
available peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides.
Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies
against the native polypeptide, as discussed further below.
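
The notion of a degenerate variant introduced above (different nucleotide sequence, identical encoded polypeptide) can be checked mechanically. The sketch below is a minimal illustration that assumes the standard genetic code and in-frame input; the function names and the short example sequences are hypothetical and are not drawn from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(a: str, b: str) -> bool:
    """True if a and b differ in nucleotide sequence but encode the same polypeptide."""
    return a.upper() != b.upper() and translate(a) == translate(b)


# Hypothetical nine-base examples: both encode Met-Ala-Lys.
print(translate("ATGGCTAAA"), translate("ATGGCGAAG"))
print(is_degenerate_variant("ATGGCTAAA", "ATGGCGAAG"))
```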

In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the
polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain
or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immu-
may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable
marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements
of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (available from Promega Biotec, Madison,
WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence
to be expressed.

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the
selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or
chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product.
Thereafter cells are typically harvested, generally by centrifugation, and disrupted to release expressed protein, generally
by physical or chemical means, and the resulting crude extract is retained for further purification.

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of
mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23: 175
(1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also
any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites, may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can
be employed for final purification steps.

An additional aspect of the invention includes Staphylococcus aureus polypeptides which are useful as
immunodiagnostic antigens and/or immunoprotective vaccines, collectively "immunologically useful polypeptides". Such
immunologically useful polypeptides may be selected from the ORFs disclosed herein based on techniques well known
in the art and described elsewhere herein. The inventors have used the following criteria to select several
immunologically useful polypeptides:

As is known in the art, an amino terminal type I signal sequence directs a nascent protein across the plasma and
outer membranes to the exterior of the bacterial cell. Such outermembrane polypeptides are expected to be
immunologically useful. According to Izard, J. W. et al., Mol. Microbiol. 13, 765-773 (1994), polypeptides containing type I
signal sequences contain the following physical attributes: the length of the type I signal sequence is approximately
15 to 25 primarily hydrophobic amino acid residues with a net positive charge in the extreme amino terminus; the
central region of the signal sequence must adopt an alpha-helical conformation in a hydrophobic environment; and the
region surrounding the actual site of cleavage is ideally six residues long, with small side-chain amino acids in the -1
and -3 positions.
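
The type I signal sequence attributes listed above lend themselves to a simple screening heuristic. The following Python sketch is an illustration under stated assumptions, not the inventors' algorithm: the hydrophobic and small residue sets, the five-residue window used for the amino-terminal charge, and the "more than half hydrophobic" cutoff are all hypothetical choices, and the alpha-helical requirement is not evaluated at all.

```python
# Hypothetical residue classes; the text does not enumerate them.
HYDROPHOBIC = set("AVLIMFWC")
SMALL = set("AGS")


def looks_like_type_i_signal(protein: str, cleavage_after: int) -> bool:
    """Heuristic test of the residues preceding a proposed cleavage site.

    cleavage_after is the number of residues in the candidate signal
    sequence, so protein[cleavage_after - 1] is the -1 position.
    """
    signal = protein[:cleavage_after].upper()
    # Approximately 15 to 25 residues long.
    if not 15 <= len(signal) <= 25:
        return False
    # Net positive charge at the extreme amino terminus (assumed: first 5 residues).
    n_term = signal[:5]
    net_charge = sum(r in "KR" for r in n_term) - sum(r in "DE" for r in n_term)
    if net_charge <= 0:
        return False
    # "Primarily hydrophobic" is read here as more than half of the residues.
    if sum(r in HYDROPHOBIC for r in signal) / len(signal) <= 0.5:
        return False
    # Small side-chain residues at the -1 and -3 positions.
    return signal[-1] in SMALL and signal[-3] in SMALL


# Hypothetical precursor: 19-residue candidate signal followed by mature residues.
example = "MKKIALLLAVLAFSAVASA" + "DTPKEV"
print(looks_like_type_i_signal(example, 19))
```

A screen of this kind only narrows the candidate list; the cleavage position itself has to be proposed by some other means before the check can be applied.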

Also known in the art is the type IV signal sequence, which is an example of the several types of functional signal
sequences which exist in addition to the type I signal sequence detailed above. Although functionally related, the type
IV signal sequence possesses a unique set of biochemical and physical attributes (Strom, M. S. and Lory, S.,
J. Bacteriol. 174, 7345-7351; 1992). These are typically six to eight amino acids with a net basic charge followed by an
additional sixteen to thirty primarily hydrophobic residues. The cleavage site of a type IV signal sequence is typically
after the initial six to eight amino acids at the extreme amino terminus. In addition, all type IV signal sequences contain
a phenylalanine residue at the +1 site relative to the cleavage site.

Studies of the cleavage sites of twenty-six bacterial lipoprotein precursors have allowed the definition of a consensus
amino acid sequence for lipoprotein cleavage. Nearly three-fourths of the bacterial lipoprotein precursors examined
contained the sequence L-(A,S)-(G,A)-C at positions -3 to +1, relative to the point of cleavage (Hayashi, S. and Wu,
H. C. Lipoproteins in bacteria. J. Bioenerg. Biomembr. 22, 451-471; 1990).
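
The lipoprotein consensus L-(A,S)-(G,A)-C can be expressed directly as a regular expression. The sketch below is illustrative only: the function name and the example precursor are hypothetical, and a match indicates nothing more than the presence of the four-residue consensus.

```python
import re

# L-(A,S)-(G,A)-C at positions -3 to +1 relative to the point of cleavage.
LIPOPROTEIN_CONSENSUS = re.compile(r"L[AS][GA]C")


def lipoprotein_cleavage_candidates(protein: str) -> list[int]:
    """Return 0-based indices of the +1 cysteine for each consensus match."""
    return [m.start() + 3 for m in LIPOPROTEIN_CONSENSUS.finditer(protein.upper())]


# Hypothetical precursor containing L-A-G-C.
precursor = "MKKTLLALSLLAGCSSN"
print(lipoprotein_cleavage_candidates(precursor))
```

Because the text reports the consensus in only about three-fourths of the precursors examined, the absence of a match does not exclude a lipoprotein.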

It is well known that most anchored proteins found on the surface of gram-positive bacteria possess a highly
conserved carboxy terminal sequence. More than fifty such proteins from organisms such as S. pyogenes, S. mutans, E.
faecalis, S. pneumoniae, and others, have been identified based on their extracellular location and carboxy terminal
amino acid sequence (Fischetti, V. A. Gram-positive commensal bacteria deliver antigens to elicit mucosal and systemic
immunity. ASM News 62, 405-410; 1996). The conserved region is comprised of six charged amino acids at the extreme
carboxy terminus coupled to 15-20 hydrophobic amino acids presumed to function as a transmembrane domain.
Immediately adjacent to the transmembrane domain is a six amino acid sequence conserved in nearly all proteins
examined. The amino acid sequence of this region is L-P-X-T-G-X, where X is any amino acid.
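
A search for the conserved L-P-X-T-G-X sequence near the carboxy terminus can be sketched as follows. The 50-residue search window is an assumption chosen to cover the six-residue motif, a 15-20 residue hydrophobic stretch and the six charged terminal residues with some slack; the function name and the example sequence are hypothetical.

```python
import re

# L-P-X-T-G-X, where X is any amino acid.
ANCHOR_MOTIF = re.compile(r"LP.TG.")


def find_anchor_motif(protein: str, c_terminal_window: int = 50):
    """Return the 0-based start of the first motif in the C-terminal window, or None."""
    tail_start = max(0, len(protein) - c_terminal_window)
    match = ANCHOR_MOTIF.search(protein.upper(), tail_start)
    return match.start() if match else None


# Hypothetical protein: filler, motif, hydrophobic stretch, charged tail.
example = "M" + "A" * 40 + "LPETGE" + "AVLIAGLVAVLIAGLLA" + "KRKEEN"
print(find_anchor_motif(example))
```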

Amino acid sequence similarities to proteins of known function by BLAST enables the assignment of putative
functions to novel amino acid sequences and allows for the selection of proteins thought to function outside the cell
wall. Such proteins are well known in the art and include "lipoprotein", "periplasmic", or "antigen".

An algorithm for selecting antigenic and immunogenic Staphylococcus aureus polypeptides including the foregoing
criteria was developed by the present inventors. Use of the algorithm by the inventors to select immunologically useful
Staphylococcus aureus polypeptides resulted in the selection of several ORFs which are predicted to be
outermembrane-associated proteins. These proteins are identified in Table 4, below, and shown in the Sequence Listing as SEQ
ID NOS:5,192 to 5,255. Thus, the amino acid sequence of each of several antigenic Staphylococcus aureus polypeptides
listed in Table 4 can be determined, for example, by locating the amino acid sequence of the ORF in the Sequence
Listing. Likewise, the polynucleotide sequence encoding each ORF can be found by locating the corresponding
polynucleotide SEQ ID in Tables 1, 2, or 3, and finding the corresponding nucleotide sequence in the sequence listing.

As will be appreciated by those of ordinary skill in the art, although a polypeptide representing an entire ORF may
be the closest approximation to a protein found in vivo, it is not always technically practical to express a complete ORF
in vitro. It may be very challenging to express and purify a highly hydrophobic protein by common laboratory methods.
As a result, the immunologically useful polypeptides described herein as SEQ ID NOS:5,192-5,255 may have been
modified slightly to simplify the production of recombinant protein, and are the preferred embodiments. In general,
nucleotide sequences which encode highly hydrophobic domains, such as those found at the amino terminal signal
sequence, are excluded for enhanced in vitro expression of the polypeptides. Furthermore, any highly hydrophobic
amino acid sequences occurring at the carboxy terminus are also excluded. Such truncated polypeptides include, for
example, the mature forms of the polypeptides expected to exist in nature.

Those of ordinary skill in the art can identify soluble portions of the polypeptides identified in Table 4 and, in the case
of truncated polypeptide sequences shown as SEQ ID NOS:5,192-5,255, may obtain the complete predicted amino
acid sequence of each polypeptide by translating the corresponding polynucleotide sequences of the corresponding
ORF listed in Tables 1, 2 and 3 and found in the sequence listing.

Accordingly, polypeptides comprising the complete amino acid sequence of an immunologically useful polypeptide selected
from the group of polypeptides encoded by the ORFs identified in Table 4, or an amino acid sequence at least 95%
identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto, form an
embodiment of the invention; in addition, polypeptides comprising an amino acid sequence selected from the group of
amino acid sequences shown in the sequence listing as SEQ ID NOS:5,191-5,255, or an amino acid sequence at least
95% identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto, form
an embodiment of the invention. Polynucleotides encoding the foregoing polypeptides also form part of the present
invention.
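
The 95%, 97% and 99% identity thresholds recited above can be illustrated with a small calculation. The sketch below assumes the two sequences have already been aligned to equal length, with gaps written as "-", and defines percent identity as identical positions divided by alignment length; that definition, the function names, and the tier labels are assumptions, since the text does not specify how identity is to be computed.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same non-zero length")
    # A column with a gap in either sequence never counts as a match.
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the thresholds recited in the text."""
    if pct >= 99:
        return "most preferred (at least 99%)"
    if pct >= 97:
        return "preferred (at least 97%)"
    if pct >= 95:
        return "embodiment (at least 95%)"
    return "below 95%"


# Hypothetical 100-residue alignment with two mismatches.
reference = "A" * 100
subject = "A" * 98 + "GG"
pct = percent_identity(reference, subject)
print(pct, identity_tier(pct))
```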

In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a
polypeptide of the invention, particularly those epitope-bearing portions (antigenic regions) identified in Table 4. The
epitope-bearing portion is an immunogenic or antigenic epitope of a polypeptide of the invention. An "immunogenic
epitope" is defined as a part of a protein that elicits an antibody response when the whole protein is the immunogen.
On the other hand, a region of a protein molecule to which an antibody can bind is defined as an "antigenic epitope."
The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, for
instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein
molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic
part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein.
See, for instance, Sutcliffe, J. G., Shinnick, T. M., Green, N. and Lerner, R. A. (1983) "Antibodies that react with
predetermined sites on proteins", Science 219:660-666. Peptides capable of eliciting protein-reactive sera are
frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and
are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or
carboxyl terminals. Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise
antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. See, for instance,
Wilson et al., Cell 37:767-778 (1984) at 777.

Antigenic epitope-bearing peptides and polypeptides of the invention preferably contain a sequence of at least
seven, more preferably at least nine and most preferably between about 15 to about 30 amino acids contained within
the amino acid sequence of a polypeptide of the invention. Non-limiting examples of antigenic polypeptides or peptides
that can be used to generate S. aureus specific antibodies include: a polypeptide comprising peptides shown in Table
4 below. These polypeptide fragments have been determined to bear antigenic epitopes of indicated S. aureus proteins
by the analysis of the Jameson-Wolf antigenic index, a representative sample of which is shown in Figure 3.

The epitope-bearing peptides and polypeptides of the invention may be produced by any conventional means.
See, e.g., Houghten, R. A. (1985) General method for the rapid solid-phase synthesis of large numbers of peptides:
specificity of antigen-antibody interaction at the level of individual amino acids. Proc. Natl. Acad. Sci. USA 82:
5131-5135; this "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent No.
4,631,211 to Houghten et al. (1986). Epitope-bearing peptides and polypeptides of the invention are used to induce
antibodies according to methods well known in the art. See, for instance, Sutcliffe et al., supra; Wilson et al., supra;
Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985).

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response
when the whole protein is the immunogen, are identified according to methods known in the art. See, for instance,
Geysen et al., supra. Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a general method of detecting
or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the
epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of
interest. More generally, U.S. Patent No. 4,433,092 to Geysen (1989) describes a method of detecting or determining
a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding
site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) on
Peralkylated Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and libraries of such
peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a
peralkylated oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the
epitope-bearing peptides of the invention also can be made routinely by these methods.

Table 4 lists immunologically useful polypeptides identified by an algorithm which locates novel Staphylococcus 
aureus outermembrane proteins, as is described above. Also listed are epitopes or "antigenic regions" of each of the 
identified polypeptides. The antigenic regions, or epitopes, are delineated by two numbers x-y, where x is the number
of the first amino acid in the open reading frame included within the epitope and y is the number of the last amino acid 
in the open reading frame included within the epitope. For example, the first epitope in ORF 168-6 is comprised of 
amino acids 36 to 45 of SEQ ID NO:5,192, as is described in Table 4. The inventors have identified several epitopes 
for each of the antigenic polypeptides identified in Table 4. Accordingly, forming part of the present invention are 
polypeptides comprising an amino acid sequence of one or more antigenic regions identified in Table 4. The invention
further provides polynucleotides encoding such polypeptides. 
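
The x-y notation for antigenic regions described above uses 1-based, inclusive residue numbers, which differs from the 0-based, half-open slicing of most programming languages. The sketch below shows the conversion; the function name and the placeholder sequence are hypothetical, and the placeholder is not a sequence from Table 4 or the Sequence Listing.

```python
def epitope_sequence(orf_protein: str, region: str) -> str:
    """Return residues x..y (1-based, inclusive) for a region written as 'x-y'."""
    first, last = (int(part) for part in region.split("-"))
    if not 1 <= first <= last <= len(orf_protein):
        raise ValueError(f"region {region!r} lies outside the sequence")
    # Convert the 1-based inclusive range to a 0-based half-open slice.
    return orf_protein[first - 1:last]


# Placeholder 60-residue sequence; a region of 36-45 spans ten residues.
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 3
print(epitope_sequence(placeholder, "36-45"))
```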

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are
substantially equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic acid
and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity
between reference and subject sequences. For purposes of the present invention, sequences having equivalent biological
activity and equivalent expression characteristics are considered substantially equivalent. For purposes of determining
equivalence, truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs from other strains of Staphylococcus aureus, of the
fragments of the Staphylococcus aureus genome of the present invention and homologs of the proteins encoded by
the ORFs of the present invention. As used herein, a sequence or protein of Staphylococcus aureus is defined as a
homolog of a fragment of the Staphylococcus aureus fragments or contigs or a protein encoded by one of the ORFs
of the present invention, if it shares significant homology to one of the fragments of the Staphylococcus aureus genome
of the present invention or a protein encoded by one of the ORFs of the present invention. Specifically, by using the
sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain
regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among
especially preferred homologs, those with 95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97%, and even more particularly preferred among those are homologs with 99% or more
homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood
that, among measures of homology, identity is particularly preferred in this regard.
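
The graded homology preferences set out above form an ordered ladder of thresholds, which can be written as a lookup table. In the sketch below the labels paraphrase the text, and the treatment of each boundary follows its wording ("greater than 85%" and "more than 90%" are exclusive, "or more" is inclusive); reading the bare "97%" as inclusive is an assumption, as are the function and variable names.

```python
# (threshold in percent, whether the threshold value itself qualifies, label)
HOMOLOGY_TIERS = [
    (99.9, True, "most preferred"),
    (99.0, True, "even more particularly preferred"),
    (97.0, True, "very particularly preferred"),  # inclusiveness assumed
    (95.0, True, "particularly preferred"),
    (93.0, True, "especially preferred"),
    (90.0, False, "preferred"),
    (85.0, False, "significant homology"),
]


def homology_tier(percent: float) -> str:
    """Return the highest tier whose threshold the given percent homology meets."""
    for threshold, inclusive, label in HOMOLOGY_TIERS:
        if percent > threshold or (inclusive and percent == threshold):
            return label
    return "not significant"


for value in (85.0, 90.0, 93.0, 96.0, 99.95):
    print(value, homology_tier(value))
```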

Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS:1-5,191 or from
a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS:1-5,191 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing
cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have
been described in great detail in many publications such as, for example, Innis et al., PCR PROTOCOLS, Academic
Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS:1-5,191 or from a nucleotide sequence having an aforementioned
identity to a sequence of SEQ ID NOS:1-5,191, one skilled in the art will recognize that by employing high stringency
conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only
sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency
conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC),
sequences which are greater than 40-50% homologous to the primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-5,191, or from a nucleotide sequence having an
aforementioned identity to a sequence of SEQ ID NOS:1-5,191, for colony/plaque hybridization, one skilled in the art will
recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and
washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe
will be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and
40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45%
homologous to the probe will be obtained.

Any organism can be used as the source for homologs of the present invention so long as the organism naturally
expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs
are bacteria which are closely related to Staphylococcus aureus.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION 

Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide.
As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and
industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one
skilled in the art to use the Staphylococcus aureus ORFs in a manner similar to the known type of sequences for which
the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A
variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial
use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed.,
Macmillan Publications, Ltd., NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier
Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar
aspects of the present invention are discussed below.

1. Biosynthetic Enzymes 

Open reading frames encoding proteins involved in mediating the catalytic reactions involved in intermediary and
macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions, including
enzymes involved in the degradation of the intermediary products of metabolism, enzymes involved in central intermediary
metabolism, enzymes involved in respiration, both aerobic and anaerobic, enzymes involved in fermentation, enzymes
involved in ATP proton motor force conversion, enzymes involved in broad regulatory function, enzymes involved in
amino acid synthesis, enzymes involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin synthesis,
can be used for industrial biosynthesis.

The various metabolic pathways present in Staphylococcus aureus can be identified based on absolute nutritional
requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS:1-5,191.

Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as
non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a
number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification
and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to
give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts
et al., Symbiosis 21: 79 (1986) and Voragen et al. in BIOCATALYSTS IN AGRICULTURAL BIOTECHNOLOGY,
Whitaker et al., Eds., American Chemical Society Symposium Series 389: 93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Staphylococcus aureus. Enzymes
involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include
sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose
oxidases, which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic
acid using Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag
Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized
form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 1: 21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid. Gluconic acids are marketed for
use in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as described,
for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI, Benett et al., Eds.,
Academic Press, New York (1985). In addition to industrial applications, GOD has found applications in medicine for
quantitative determination of glucose in body fluids and recently in biotechnology for analyzing syrups from starch and
cellulose hydrolysates. This application is described in Owusu et al., Biochem. et Biophysica. Acta. 872: 83 (1986), for
instance.

The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field
of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble
enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of
Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of
the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the
largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a
large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See
Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and
Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial
Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for
instance, Macrae et al., Philosophical Transactions of the Royal Society of London 310:227 (1985) and Poserke,
Journal of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the
production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications
of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course of the
washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex
organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral
intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those scientists
involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al.,
Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)).
The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters,
phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides,
reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides,
and carbon bond forming reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation
and organic synthesis, it is sometimes necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an
isolated, partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain
(1987), p. 127.

Amino transferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic
production of amino acids. The advantage of using microbial-based enzyme systems is that the amino transferase
enzymes catalyze the stereoselective synthesis of only L-amino acids and generally possess uniformly high catalytic
rates. A description of the use of amino transferases for amino acid production is provided by Roselle-David, Methods
of Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in
nucleic acid synthesis, repair, and recombination. A variety of commercially important enzymes have previously been
isolated from members of Staphylococcus aureus. These include Sau3A and Sau96I.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety of
procedures and methods known in the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be
either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of the proteins of the present invention and 
hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting a 
specific monoclonal antibody. 

In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of
producing the desired antibody are well known in the art (Campbell, A. M., MONOCLONAL ANTIBODY TECHNOLOGY:
LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature
256:495-497 (1975)), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today
4:72 (1983); pgs. 77-96 of Cole et al., in MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc.
(1985)).

Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the pseudogene
polypeptide. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal
injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of
the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the
peptide and the site of injection.

The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase 
the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include but 
are not limited to coupling the antigen with a heterologous protein (such as globulin or galactosidase) or through the 
inclusion of an adjuvant during immunization. 

For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such
as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells.

Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces
an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western
blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and subclass are determined using procedures
known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)).

Techniques described for the production of single chain antibodies (U. S. Patent 4,946,778) can be adapted to 
produce single chain antibodies to proteins of the present invention. 

For polyclonal antibodies, antibody-containing antiserum is isolated from the immunized animal and is screened for
the presence of antibodies with the desired specificity using one of the above-described procedures.

The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can
be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels
(such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, etc. Procedures for accomplishing such labelling are well known in the art, for example see
(Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval,
E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976)).

The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells 
or tissues in which a fragment of the Staphylococcus aureus genome is expressed. 

The present invention further provides the above-described antibodies immobilized on a solid support. Examples
of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose,
and acrylic resins such as polyacrylamide and latex beads. Techniques for coupling antibodies to such solid supports
are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34, Academic Press, N.Y. (1974)).
The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for
immunoaffinity purification of the proteins of the present invention.

3. Diagnostic Assays and Kits

The present invention further provides methods to identify the expression of one of the ORFs of the present
invention, or homolog thereof, in a test sample, using one of the DFs, antigens or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more of the antibodies, or one or more of 
the DFs, or one or more antigens of the present invention and assaying for binding of the DFs, antigens or antibodies 
to components within the test sample. 

Conditions for incubating a DF, antigen or antibody with a test sample vary. Incubation conditions depend on the
format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used
in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification
or immunological assay formats can readily be adapted to employ the DFs, antigens or antibodies of the present
invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier
Science Publishers, Amsterdam, The Netherlands (1985); and PCT publication WO95/32291, all of which are hereby
incorporated herein by reference.

The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids
such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based
on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed.
Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry
out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the DFs, antigens or antibodies of the present invention; and
(b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting
the presence of a bound DF, antigen or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such
containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one
to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are
not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test sample, a container which
contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody, antigen or DF.

Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies, or in the
alternative, if the primary antibody is labelled, the enzymatic, or antibody binding reagents which are capable of reacting with
the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs, antigens and antibodies of the
present invention can be readily incorporated into one of the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents 

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the
Staphylococcus aureus fragments or contigs herein described.

In general, such methods comprise steps of:

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated
fragment of the Staphylococcus aureus genome; and

(b) determining whether the agent binds to said protein or said fragment.

The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives,
or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed
using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected 
at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention. 

Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally
selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides,
for example see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," In Synthetic Peptides, A User's
Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical
agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to
control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above,
such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled
artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by
binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can
be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary
to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney
et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano,
J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shutoff of RNA transcription from DNA, while antisense
RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated
to be effective in model systems. Information contained in the sequences of the present invention can be used to design
antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
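
As a rough illustration of how sequence information could be used to propose antisense oligonucleotides of the 20 to 40 base length mentioned above, the following Python sketch enumerates candidate oligonucleotides complementary to windows of a sense-strand target. The target sequence, the function names and the sliding-window approach are assumptions made for illustration only; they are not taken from the disclosure, and real oligonucleotide design would also weigh factors such as secondary structure and specificity.

```python
# Minimal sketch: enumerate antisense oligonucleotide candidates.
# The target sequence below is invented for illustration only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_candidates(target: str, length: int = 20):
    """Yield (start, oligo) pairs, one per window of `length` bases.

    Each oligo is the reverse complement of the window, so it can
    base-pair with that stretch of the sense strand (or its mRNA).
    """
    if not 20 <= length <= 40:
        raise ValueError("oligo length is expected to be 20 to 40 bases")
    for start in range(len(target) - length + 1):
        yield start, reverse_complement(target[start:start + length])


target = "ATGAAAGCTTTAGGCGTACCGATTGCA"  # hypothetical 27-base region
candidates = list(antisense_candidates(target, 20))
first_start, first_oligo = candidates[0]
print(first_start, first_oligo)
```

Each candidate is reported with the start position of the window it targets, so that candidates can later be filtered by whatever selection criteria the practitioner prefers.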

5. Pharmaceutical Compositions and Vaccines

The present invention further provides pharmaceutical agents which can be used to modulate the growth or
pathogenicity of Staphylococcus aureus, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known techniques to provide a
pharmaceutical composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical
agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are
identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of Staphylococcus aureus
or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the
organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity
of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to
practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth or
pathogenicity by binding to an important protein, thus blocking the biological activity of the protein, while other agents may
bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs
of the present invention and serve as a vaccine. The development and use of vaccines derived from membrane
associated polypeptides are well known in the art. The inventors have identified particularly preferred immunogenic
Staphylococcus aureus polypeptides for use as vaccines. Such immunogenic polypeptides are described above and
summarized in Table 4, below.

As used herein, a "related organism" is a broad term which refers to any organism whose growth or pathogenicity
can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will
contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As
such, related organisms do not need to be bacterial but may be fungal or viral pathogens.

The pharmaceutical agents and compositions of the present invention may be administered in a convenient
manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal
routes. The pharmaceutical compositions are administered in an amount which is effective for treatment and/or
prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight
and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most
cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of
administration, symptoms, etc.

The agents of the present invention can be used in native form or can be modified to form a chemical derivative.
As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical
moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological
half-life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable
side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources,
REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.

For example, such moieties may change an immunological character of the functional derivative, such as affinity
for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological
half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers
also may be effected in this way and can be assayed by methods well known to the skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient
by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood
or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the
preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single
or multiple injections.

In providing a patient with one of the agents of the present invention, the dosage of the administered agent will
vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical
history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about
[...] (body weight of patient), although a lower or higher dosage may be administered. The therapeutically
effective dose can be lowered by using combinations of the agents of the present invention or another agent.

As used herein, two or more compounds or agents are said to be administered "in combination" with each other
when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be administered concurrently with, prior
to, or following the administration of the other agent.

The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to 
decrease the rate of growth (as defined above) of the target organism. 

The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.
When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of
any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an
indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological
symptoms of the infection and to increase the rate of recovery.

The agents of the present invention are administered to a subject, such as a mammal, or a patient, in a
pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be
"pharmacologically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered
in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is
physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known methods to prepare pharmaceutically
useful compositions, whereby these materials, or their functional derivatives, are combined in admixture with a
pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins,
e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed.,
Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a pharmaceutically acceptable composition suitable
for effective administration, such compositions will contain an effective amount of one or more of the agents of the
present invention, together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations
may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques, including formulation with
macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate,
methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent
in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired
time course of release. Another possible method to control the duration of action by controlled release preparations is
to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino
acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these
agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by
coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or gelatine
microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example,
liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or
more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be
a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals
or biological products, which notice reflects approval by the agency of manufacture, use or sale for human
administration.

In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a random shotgun 
approach. This procedure, described in detail in the examples that follow, has eliminated the up front cost of isolating 
and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols. 

Certain aspects of the present invention are described in greater detail in the examples that follow. The examples
are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the
inventors, as will be clear to those of skill in the art from reading the present disclosure.

ILLUSTRATIVE EXAMPLES

LIBRARIES AND SEQUENCING

1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman
(Lander and Waterman, Genomics 2:231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, $P_0$, that any given base in a sequence of size L, in nucleotides, is not sequenced after
a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation
$P_0 = e^{-m}$, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m = 1 when 2.8 Mb of sequence has
been randomly generated (1X coverage). At that point, $P_0 = e^{-1} = 0.37$. The probability that any given base has not
been sequenced is the same as the probability that any region of the whole sequence L has not been determined and,
therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage,
approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence
has been generated, coverage is 5X for a 2.8 Mb sequence and the unsequenced fraction drops to 0.0067 or 0.67%. 5X coverage
of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with
an average sequence read length of 410 bp.
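
The figures in this paragraph can be checked with a short calculation. The Python sketch below assumes the fold coverage is the amount of sequence generated divided by the genome size, and that each clone yields two reads (one from each insert end); the function names are illustrative and not part of the disclosure.

```python
import math


def unsequenced_fraction(bases_sequenced: float, genome_size: float) -> float:
    """Poisson estimate P0 = exp(-m), with fold coverage m = n / L."""
    m = bases_sequenced / genome_size
    return math.exp(-m)


def clones_for_coverage(fold: float, genome_size: float,
                        read_length: float, reads_per_clone: int = 2) -> int:
    """Clones needed to reach `fold` coverage when every clone is read
    `reads_per_clone` times at an average of `read_length` bases."""
    return math.ceil(fold * genome_size / (read_length * reads_per_clone))


L = 2.8e6                                  # genome size in nucleotides
p0_1x = unsequenced_fraction(2.8e6, L)     # about 0.37 at 1X
p0_5x = unsequenced_fraction(14e6, L)      # about 0.0067 at 5X
clones = clones_for_coverage(5, L, 410)    # roughly 17,000 clones

print(f"P0 at 1X: {p0_1x:.3f}")
print(f"P0 at 5X: {p0_5x:.4f}")
print(f"clones for 5X: {clones}")
```

Under these assumptions the clone count comes out slightly above 17,000, in line with the approximate figure given in the text.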

Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$, and the average gap size, g, follows the
equation $g = L/n$, where n in this case is the number of sequence reads. Thus, 5X coverage leaves about 240 gaps
averaging about 82 bp in size in a polynucleotide 2.8 Mb long.
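
The gap estimate can be sketched in the same way. The Python snippet below assumes 17,000 clones read from both ends (34,000 reads of 410 bp on average) and takes the average gap size to be the genome size divided by the number of reads, an interpretation that reproduces the 82 bp figure quoted in the text; the variable names are illustrative only.

```python
import math

genome_size = 2.8e6          # L, in nucleotides
reads = 2 * 17000            # both insert ends of 17,000 clones
read_length = 410            # average read length in bp

# Fold coverage from the reads actually collected (close to 5X).
coverage = reads * read_length / genome_size

# Total unsequenced length, G = L * exp(-m).
total_gap_length = genome_size * math.exp(-coverage)

# Average gap size, taken here as L divided by the number of reads.
average_gap = genome_size / reads

# Implied number of gaps.
gap_count = total_gap_length / average_gap

print(f"coverage: {coverage:.2f}X")
print(f"total gap length: {total_gap_length:.0f} bp")
print(f"average gap: {average_gap:.0f} bp")
print(f"gaps: {gap_count:.0f}")
```

With these inputs the gap count lands in the low hundreds, the same order as the "about 240" stated in the text; the exact value depends on how the coverage is rounded.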

The treatment above is essentially that of Lander and Waterman, Genomics 2:231 (1988).

2. Random Library Construction

In order to approximate the random model described above during actual sequencing, a nearly ideal library of
cloned genomic fragments is required. The following library construction procedure was developed to achieve this end.

Staphylococcus aureus DNA was prepared by phenol extraction. A mixture containing 600 ug DNA in 3.3 ml of
300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 30% glycerol was sonicated for 1 min. at 0°C in a Branson
Model 450 Sonicator at the lowest energy setting using a 3 mm probe. The sonicated DNA was ethanol precipitated
and redissolved in 500 ul TE buffer.

To create blunt ends, a 100 ul aliquot of the resuspended DNA was digested with 5 units of BAL31 nuclease (New
England BioLabs) for 10 min at 30°C in 200 ul BAL31 buffer. The digested DNA was phenol-extracted,
ethanol-precipitated, redissolved in 100 ul TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting
temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size was excised from the gel, the
LGT agarose was melted, and the resulting solution was extracted with phenol to separate the agarose from the DNA.
DNA was ethanol precipitated and redissolved in 20 ul of TE buffer for ligation to vector.

A two-step ligation procedure was used to produce a plasmid library with 97% inserts, of which >99% were single
inserts. The first ligation mixture (50 ul) contained 2 ug of DNA fragments, 2 ug pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and was incubated
at 14°C for 4 hr. The ligation mixture then was phenol extracted and ethanol precipitated, and the precipitated DNA
was dissolved in 20 ul TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder
were visualized by ethidium bromide staining and UV illumination and identified by size as insert (i), vector (v), v+i,
v+2i, v+3i, etc. The portion of the gel containing v+i DNA was excised and the v+i DNA was recovered and resuspended
into 20 ul TE. The v+i DNA then was blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture
(50 ul) containing the v+i linears, 500 uM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs)
under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v+i linears were
dissolved in 20 ul TE. The final ligation to produce circles was carried out in a 50 ul reaction containing 5 ul of v+i
linears and 5 units of T4 ligase at 14°C overnight. After 10 min at 70°C the following day, the reaction mixture was
stored at -20°C.

This two-stage procedure resulted in a molecularly random collection of single-insert plasmid recombinants with 
minimal contamination from double-insert chimeras (<1%) or free vector (<3%). 

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all
recombination and restriction functions (A. Greener, Strategies 3(1):5 (1990)) were used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells were plated directly on antibiotic diffusion
plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells.

Plating was carried out as follows. A 100 µl aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene
200152) was thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 µl aliquot of 1.42 M
beta-mercaptoethanol was added to the aliquot of cells to a final concentration of 25 mM. Cells were incubated on ice for
10 min. A 1 µl aliquot of the final ligation was added to the cells and incubated on ice for 30 min. The cells were heat
pulsed for 30 sec at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture was eliminated
from this protocol in order to minimize the preferential growth of any given transformed cell. Instead, the transformation
mixture was plated directly on a nutrient-rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar:
20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented
with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml
X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4/100 ml SOB agar. The 15 ml top layer was poured just prior to plating.
Our titer was approximately 100 colonies/10 µl aliquot of transformation.
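
The stated final concentrations can be checked with simple dilution arithmetic (C1 x V1 = C2 x total volume). The sketch below is illustrative only: the function name is hypothetical, and it assumes the stock is simply added to the stated base volume with no other volume changes.

```python
def final_concentration(stock_conc, stock_vol, base_vol):
    """Concentration after adding stock_vol of stock to base_vol (same volume units)."""
    return stock_conc * stock_vol / (stock_vol + base_vol)

# 1.7 µl of 1.42 M (1420 mM) beta-mercaptoethanol added to 100 µl of cells
bme_mM = final_concentration(1420.0, 1.7, 100.0)

# 0.4 ml of 50 mg/ml (50,000 µg/ml) ampicillin added to 100 ml of SOB agar
amp_ug_per_ml = final_concentration(50_000.0, 0.4, 100.0)

print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
print(f"ampicillin: {amp_ug_per_ml:.0f} µg/ml")
```

Under these assumptions the beta-mercaptoethanol works out to a little under the nominal 25 mM, and the ampicillin to roughly 200 µg/ml in the bottom layer.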

All colonies were picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA
or deleterious gene products would be deleted from the library, resulting in a slight increase in gap number over that
expected.

3. Random DNA Sequencing 

High quality double stranded DNA plasmid templates were prepared using an alkaline lysis method developed in
collaboration with 5Prime → 3Prime Inc. (Boulder, CO). Plasmid preparation was performed in a 96-well format for all
stages of DNA preparation from bacterial growth through final DNA purification. Average template concentration was
determined by running 25% of the samples on an agarose gel. DNA concentrations were not adjusted.

Templates were also prepared from a Staphylococcus aureus lambda genomic library. An unamplified library was
constructed in Lambda DASH II vector (Stratagene). Staphylococcus aureus DNA (> 100 kb) was partially digested in
a reaction mixture (200 µl) containing 50 µg DNA, 1X Sau3AI buffer, and 20 units Sau3AI for 6 min at 23°C. The digested
DNA was phenol-extracted and centrifuged over a 10-40% sucrose gradient. Fractions containing genomic DNA of
15-25 kb were recovered by precipitation. One µl of fragments was used with 1 µl of DASH II vector (Stratagene) in
the recommended ligation reaction. One µl of the ligation mixture was used per packaging reaction following the
recommended protocol with the Gigapack II XL Packaging Extract. Phage were plated directly without amplification from
the packaging mixture (after dilution with 500 µl of recommended SM buffer and chloroform treatment). Yield was about
2.5x10^9 pfu/µl.

An amplified library was prepared from the primary packaging mixture according to the manufacturer's protocol.
The amplified library is stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10 s pfu/ml.

Mini-liquid lysates (0.1 µl) are prepared from randomly selected plaques and template is prepared by long range
PCR. Samples are PCR amplified using modified T3 and T7 primers, and Elongase Supermix (LTI).

Sequencing reactions are carried out on plasmid templates using a combination of two workstations (BIOMEK
1000 and Hamilton Microlab 2200) and the Perkin-Elmer 9600 thermocycler with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers.
Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler
using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. Modified T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are run on a combination
of ABI 373 DNA Sequencers and ABI 377 DNA Sequencers. All of the dye terminator sequencing reactions are analyzed
using the 2X 9 hour module on the ABI 377. Dye primer reactions are analyzed on a combination of ABI 373 and ABI
377 DNA Sequencers. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1
sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences,
445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
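
As a rough worked example, the success rates and usable read lengths quoted above can be combined into an expected yield of usable bases per attempted reaction. This is only a back-of-the-envelope sketch: it assumes failed reactions contribute no usable sequence, and the names used are illustrative rather than part of the described system.

```python
# (success rate, average usable read length in bp), as quoted in the text
READ_STATS = {
    "M13-21": (0.85, 485),
    "M13RP1": (0.85, 445),
    "dye-terminator": (0.65, 375),
}

def expected_bases(success_rate, read_length):
    """Expected usable bases per attempted reaction, counting failures as zero."""
    return success_rate * read_length

expected = {name: expected_bases(*stats) for name, stats in READ_STATS.items()}

for name, bases in expected.items():
    print(f"{name}: about {bases:.0f} usable bp per attempted reaction")
```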

4. Protocol for Automated Cycle Sequencing

The sequencing was carried out using Hamilton Microstation 2200, Perkin Elmer 9600 thermocyclers, ABI 373
and ABI 377 Automated DNA Sequencers. The Hamilton combines pre-aliquoted templates and reaction mixes
consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing
primers, and reaction buffer. Reaction mixes and templates were combined in the wells of a 96-well thermocycling
plate and transferred to the Perkin Elmer 9600 thermocycler. Thirty consecutive cycles of linear amplification (i.e., one
primer synthesis) steps were performed including denaturation, annealing of primer and template, and extension; i.e.,
DNA synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for
an oil overlay.

Two sequencing protocols were used: one for dye-labelled primers and a second for dye-labelled dideoxy chain
terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four
terminator nucleotides. Each dye-primer was labelled with a different fluorescent dye, permitting the four individual
reactions to be combined into one lane of the 373 or 377 DNA Sequencer for electrophoresis, detection, and
base-calling. ABI currently supplies premixed reaction mixes in bulk packages containing all the necessary non-template
reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both
dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable
sequences.

Thirty-two reactions were loaded per ABI 373 Sequencer each day and 96 samples can be loaded on an ABI 377
per day. Electrophoresis was run overnight (ABI 373) or for 2 1/2 hours (ABI 377) following the manufacturer's protocols.
Following electrophoresis and fluorescence detection, the ABI 373 or ABI 377 performs automatic lane tracking and
base-calling. The lane-tracking was confirmed visually. Each sequence electropherogram (or fluorescence lane trace)
was inspected visually and assessed for quality. Trailing sequences of low quality were removed and the sequence
itself was loaded via software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence
was removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373
or ABI 377 were around 400 bp and depend mostly on the quality of the template used for the sequencing reaction.
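
The text describes removal of trailing low-quality sequence after visual inspection and does not specify a rule. Purely as an illustration of what an automated version of that step might look like, the sketch below trims the 3' end of a read until the final window contains few ambiguous base calls (N). The window size, the threshold, and the use of N as the quality signal are all assumptions, not the procedure used here.

```python
def trim_trailing_low_quality(seq, window=20, max_n_fraction=0.1):
    """Drop 3' bases until the last `window` bases contain few ambiguous calls."""
    seq = seq.upper()
    end = len(seq)
    while end > 0:
        tail = seq[max(0, end - window):end]
        if tail.count("N") / len(tail) <= max_n_fraction:
            break  # the tail is clean enough; stop trimming
        end -= 1
    # Remove any ambiguous calls left at the very end of the kept region.
    return seq[:end].rstrip("N")

read = "ACGT" * 25 + "N" * 30   # hypothetical read with a degraded 3' end
trimmed = trim_trailing_low_quality(read)
print(len(read), "->", len(trimmed))
```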

INFORMATICS 

1. Data Management 

A number of information management systems for a large-scale sequencing lab have been developed. (For review
see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System
Sciences, IEEE Computer Society Press, Washington D.C., 585 (1993)). The system used to collect and assemble
the sequence data was developed using the Sybase relational database management system and was designed to
automate data flow wherever possible and to reduce user error. The database stores and correlates all information
collected during the entire operation from template preparation to final analysis of the genome. Because the raw output
of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based
on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which
allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort.

2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence
fragments was employed to generate contigs. The TIGR Assembler simultaneously clusters and assembles fragments
of the genome. In order to obtain the speed necessary to assemble more than 10^4 fragments, the algorithm builds a
hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The
number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements.
Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add
the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a
modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S.,
Methods in Enzymology 164:765 (1988)). The contig is extended by the fragment only if strict criteria for the quality
of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched
end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal
coverage and raised in regions with a possible repetitive element. Fragments representing the boundaries of
repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of
alignments and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information
coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from
two ends of the same template point toward one another in the contig and are located within a certain range of base
pairs (definable for each clone based on the known clone size range for a given library).
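
The hash-table step described above can be sketched as follows. This is a minimal illustration of indexing 12 bp subsequences to list candidate overlaps and to count potential overlaps per fragment; it is not the TIGR Assembler code. The function names and the toy fragments are hypothetical, and the sketch ignores reverse-complement matches and the later alignment and match-criteria steps.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length named in the text

def build_kmer_index(fragments, k=K):
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index

def candidate_overlaps(fragments, k=K):
    """Return {(id_a, id_b): number of distinct shared k-mers} for fragment pairs."""
    shared = defaultdict(int)
    for ids in build_kmer_index(fragments, k).values():
        for pair in combinations(sorted(ids), 2):
            shared[pair] += 1
    return dict(shared)

def overlap_counts(pairs):
    """Number of candidate partners per fragment; high counts may hint at repeats."""
    counts = defaultdict(int)
    for a, b in pairs:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)

# Toy data: f1 and f2 share a 15 bp stretch of the same source; f3 is unrelated.
source = "ACGTTGCAATGCCGTAGCTAGGATCCGATTACGGCTAAGT"
fragments = {"f1": source[:25], "f2": source[10:35], "f3": "T" * 20}

pairs = candidate_overlaps(fragments)
print(pairs)
print(overlap_counts(pairs))
```

In a fuller pipeline each candidate pair would then be aligned and accepted only if it met the overlap-length, unmatched-end, and percent-match criteria described in the text.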

3. Identifying Genes 

The predicted coding regions of the Staphylococcus aureus genome were initially defined with the program zorf,
which finds ORFs of a minimum length. The predicted coding region sequences were used in searches against a
database of all Staphylococcus aureus nucleotide sequences from GenBank (release 92.0), using the BLASTN search
method to identify overlaps of 50 or more nucleotides with at least a 95% identity. Those ORFs with nucleotide sequence
matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and
compared to a non-redundant database of known proteins generated by combining the Swiss-prot, PIR and GenPept
databases. ORFs of at least 80 amino acids that matched a database protein with BLASTP probability less than or
equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases.
ORFs of at least 120 amino acids that did not match protein or nucleotide sequences in the databases at these levels
are shown in Table 3.
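
The internals of zorf are not described here, so the following is only a generic sketch of a minimum-length ORF scan, not a reimplementation of that program. It assumes an ORF runs from an ATG to the first in-frame stop codon, scans all six reading frames, and uses a hypothetical `min_codons` parameter for the length cutoff.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs(seq, min_codons=80):
    """Return (strand, start, end, dna) for ATG-to-stop ORFs of at least min_codons.

    Coordinates are 0-based, end-exclusive, and relative to the scanned strand.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # open a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None                   # close it at the stop codon
    return orfs

# Tiny hypothetical example with a six-codon ORF (stop codon not counted).
demo = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
demo_orfs = find_orfs(demo, min_codons=3)
print(demo_orfs)
```

With the default cutoff of 80 codons the toy sequence yields no ORFs, which mirrors the role of a minimum-length filter before database searching.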

ILLUSTRATIVE APPLICATIONS 

1. Production of an Antibody to a Staphylococcus aureus Protein

Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the
methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as
E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by
concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the
protein can then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion 

Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from
murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein
over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated.
The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells
destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused
cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued.
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay
procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use.
Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular
Biology, Elsevier, New York, Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogenous epitopes of a single protein can be prepared by
immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance
immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and
the host species. For example, small molecules tend to be less immunogenic than others and may require the use of
carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate or
excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple
intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis,
J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971).

Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as
determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the
antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology,
Wier, D., ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum
(about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described,
for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer.
Soc. For Microbiology, Washington, D.C. (1980).

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which
determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively
or qualitatively to identify the presence of antigen in a biological sample. In addition, they are useful in various animal
models of Staphylococcal disease known to those of skill in the art as a means of evaluating the protein used to make
the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic
reagent.

3. Preparation of PCR Primers and Amplification of DNA 

Various fragments of the Staphylococcus aureus genome, such as those of Tables 1-3 and SEQ ID NOS: 1-5,191
can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses. The PCR primers
are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence,
it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are
approximately the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow.

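
The primer guidance above (minimum length, similar G/C content, similar melting temperature) can be expressed as a quick check. The sketch below is illustrative only: it estimates melting temperature with the simple Wallace rule (2°C per A/T plus 4°C per G/C), which is a rough approximation for short oligonucleotides, and the 4°C tolerance, function names, and example primers are hypothetical choices rather than values taken from the text.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)

def wallace_tm(primer):
    """Rough melting temperature (°C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc

def primers_compatible(fwd, rev, min_len=15, max_tm_diff=4.0):
    """True if both primers meet the minimum length and have similar estimated Tm."""
    if len(fwd) < min_len or len(rev) < min_len:
        return False
    return abs(wallace_tm(fwd) - wallace_tm(rev)) <= max_tm_diff

# Hypothetical primer pair, for illustration only.
fwd = "ATGGCACAAGATATCATT"
rev = "TGTCGATAATCCATTTTAC"
print(wallace_tm(fwd), wallace_tm(rev), primers_compatible(fwd, rev))
```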
4. Gene Expression from DNA Sequences Corresponding to ORFs

A fragment of the Staphylococcus aureus genome provided in Tables 1-3 is introduced into an expression vector
using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein
translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Commercially
available vectors and expression systems are available from a variety of suppliers including Stratagene (La Jolla, California),
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and facilitate
proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular
expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767, incorporated herein by this reference.

The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the
Staphylococcus aureus genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence
can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using
BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1
(Stratagene) for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney
Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes
the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Staphylococcus aureus DNA
is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Staphylococcus
aureus DNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at
the 5' end of the corresponding Staphylococcus aureus DNA 3' primer, taking care to ensure that the Staphylococcus
aureus DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from
the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and
ligated to pXT1, now containing a poly A addition sequence and digested with BglII.

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island,
New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the
transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri). The protein is preferably released into the supernatant.
However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or
expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product,
synthetic 15-mer peptides synthesized from the predicted Staphylococcus aureus DNA sequence are injected into
mice to generate antibody to the polypeptide encoded by the Staphylococcus aureus DNA.

Alternatively, if antibody production is not possible, the Staphylococcus aureus DNA sequence is additionally
incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin
moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the
globin moiety and the polypeptide encoded by the Staphylococcus aureus DNA so that the latter may be freed from
the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5
(Stratagene). This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed
transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These
techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods
texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance
representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be
produced using in vitro translation systems such as in vitro Express(TM) Translation Kit (Stratagene).

While the present invention has been described in some detail for purposes of clarity and understanding, one
skilled in the art will appreciate that various changes in form and detail can be made without departing from the true
scope of the invention.

All patents, patent applications and publications referred to above are hereby incorporated by reference. 
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202-21 1 


222-231 


261-270 


58_5      184-203   260-269   275-299   330-344   372-381   424-433
188_3
236_6     138-147   163-172   187-198   244-261   268-278   308-317
310_8     131-140   144-153   177-186   190-199   204-213   216-227
601_1     208-218
544_3     170-179   184-193   224-235   274-287   327-336   352-361
662_1
87_7
120_1
Table 4

ORF       Antigenic Regions (cont)
          Region 11 Region 12 Region 13 Region 14 Region 15 Region 16
168_6
238_1
51_2
278_3
276_2
45_4
316_8
154_15
228_3
228_6
50_1
112_7
442_1
66_2
304_2
44_1
161_4
46_5      306-315
942_1
5_4       393-407   416-426   456-465
20_4      396-405   410-419   461-481
328_2
520_2
771_1
999_1
853_1
287_1
288_2
596_2
217_5
217_6
528_3
171_11
63_4
353_2
743_1
342_4
69_3
70_6      453-471   506-515
129_2     296-315
58_5
188_3
236_6     358-377   410-423   428-439   442-457   467-476   480-493
310_8     238-251   256-275   281-290   296-310   314-333   338-347
601_1
544_3
662_1
87_7
120_1
ORF       Antigenic Regions (cont)
          Region 17 Region 18 Region 19 Region 20 Region 21 Region 22
168_6
238_1
51_2
278_3
276_2
45_4
316_8
154_15
228_3
228_6
50_1
112_7
442_1
66_2
304_2
44_1
161_4
46_5
942_1
5_4
20_4
328_2
520_2
771_1
999_1
853_1
287_1
288_2
596_2
217_5
217_6
528_3
171_11
63_4
353_2
743_1
342_4
69_3
70_6
129_2
58_5
188_3
236_6
310_8     357-366   370-379   429-438   443-452   478-487   551-560
601_1
544_3
662_1
87_7
120_1
Table 4

ORF       Antigenic Regions (cont)
          Region 23 Region 24 Region 25 Region 26 Region 27 Region 28
168_6
238_1
51_2
278_3
276_2
45_4
316_8
154_15
228_3
228_6
50_1
112_7
442_1
66_2
304_2
44_1
161_4
46_5
942_1
5_4
20_4
328_2
520_2
771_1
999_1
853_1
287_1
288_2
596_2
217_5
217_6
528_3
171_11
63_4
353_2
743_1
342_4
69_3
70_6
129_2
58_5
188_3
236_6
310_8     622-632   670-685   708-718   823-836   858-867   877-886
601_1
544_3
662_1
87_7
120_1
ORF       Antigenic Regions (cont)
          Region 29 Region 30
168_6
238_1
51_2
278_3
276_2
45_4
316_8
154_15
228_3
228_6
50_1
112_7
442_1
66_2
304_2
44_1
161_4
46_5
942_1
5_4
20_4
328_2
520_2
771_1
999_1
853_1
287_1
288_2
596_2
217_5
217_6
528_3
171_11
63_4
353_2
743_1
342_4
69_3
70_6
129_2
58_5
188_3
236_6
310_8
601_1
544_3
662_1
87_7
120_1
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Table 4 



ORF 



BLAST 



461 ;5241 



Antig enic i Re g ions 

HOMOLOG Reg ion 1 Regi on 2 

._i!^!XS!€ 5?5J^f^£2SS 8-1 7 _ _36l 52 . 

6 3_4 5242 glycerol e ster hydrolase (P. 9-26 i J 57-73 

17 4_6 ^AlJsejCopjnton yj -80_ 203*^2 Jz_ 

206_1 6 5244 jornithine acetyltransf erase 1 -10 ~~ ; 34-43_ 

267.1 i 5245 NaH-ant ipo rter protein (E. r 120-129 i 332-347 



JJ22.1 |5246 
415.2 5247 
214_3 !5248 



acrifjayin resistance protein 
— transpor t ATP-b inding prou 



58-75 
108-126 



153-164 



__Region 3 
J3-96^ 
93-107_ 
242-2S4 
54-63 ~ 
398-40 8 
203-231 



218-227 



587.3 '5249 Lfjumping factory 

-§!9£*§i peptidase 



685.1 



54.3 



; 5250 
15251 



2^nitrppro pane dioxy g enasc 123-136 i 216 -233 

43-54 
72-81 



5-14 



59-68 



^bron^tin binding P£OteinJ _ 23-32 j 37-46 

_54.4 5252 fibronecti n binding protein 1 43-S2 6 6~75 

54_5 1525 3 fibr onectin bindin g prot ein I 49-60 I 81-90 



_298-308 
283-2*92" 



,54.6 



328^1 



5254 fibronecti n bindin g protein I 5 5-71 

5255 lipoprotein (H. flu) i 1-20 



82-97 



61-70 



_J9_-68 
__86^5_ 

5~0-5?_ 
95-104 



139^158 
96-105 



Region 4 

123-133" 
265_-274 
194-210 

J 64^284^ 
_3 1 5-334 
_297-306 
_76^5 
. 99-108 
89-98 
,147-lsV 

175-18T 



Table 4

ORF       Antigenic Regions (cont)
          Region 5  Region 6  Region 7  Region 8  Region 9  Region 10
46_1      215-242   333-352   376-385   416-432   471-487
63_4      145-154   191-202   212-223   245-265   274-283   291-300
174_6
206_16    239-259   275-284
267_1
322_1     298-319   350-359
415_2     344-353   371-380   395-404   456-465   486-495   518-527
214_3     318-337   365-375
587_3     106-115   142-151   156-166   173-182   186-198   204-213
685_1     113-122   130-145
54_3      128-138   185-194   217-226   251-260   268-277   295-305
54_4      175-188   191-200   203-212   220-229
54_5
54_6      220-230   287-304   317-326   344-353   364-373   378-387
328_1
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Table 4 



ORF 


i 


Antigenic 


Regions 


{cont) 


! 






Region 1 1 


Region 1 2 


• Region 13 ! 


Region 14 


; Region 1 5 


Region 17 


46.1 


i 




i i 








63.4 


306-315 


319-328 


366-37G 1395-420 


453-462 


467-476 


1 74.6 






1 

; i 


- 




206^1 6_ 


-;- : 





! i 








267.1 


1 ' 




I ! 




i 


I 


322_1 
415.2 


l 

539-555 




i 




i 
• 




214.3 


i 










i 


587.3 


217-226 


278-_287_ 


1 318-327 


332-342 


1351-360 


J3ZZl386 


685.1 
543 


7""3i6r32s~"! 




I _ 

355-372 


r _ _ 

387-396 


~ 4T6^42S 


i 

^13 8^4 4 8 " 


54.4 










1 




54.5 
54_6 


! 396-407 


427-436 


I 514-531 


541-550 


! 569-578 


1612-622 


328.1 






1 




1 





Table 4 




ORF 




Antigenic 


Regions 


; (cont) 




_ Region 23 




: Region 18 


Region 1 9 


Region 20 


i 


Region 21 


Region 22 


46_1 








63.4 


.485-500 


51 3-525 








174.6 








r 

i . 






206_16 
















267.1 








i 








322.1 


i 






i 








415.2 
















214.3 


i 




r 








587.3 




i459-470 




485-494 


505-514 


531-562 


685.1 


j396-405 


.L 






i 


54.3 


1455-462 


1472-491 


■ 517-536 








54_4 


i 




i 






1 


_.j 

i 


54.5 


1 


■i 


J 






1 


54.6 


1639-648 


'673-681 


1703-715 




723-732 


: 749-760 


j 772-788 
i 


328 1 ! ! 








ORr 




Antigenic 


Regions 


(cont) 






. 

46.1 


Region 24 


: Region 25 


Region 26 


Region 27 


Region 28 

... 




Region 29 

.... 


63.4 














174_6 












k _ . 


206.16 














267.1 








1 


1 




322.1 














415.2 















214_3 














587_3 


567-578 


584-601 


607-840 


844-854 


858-870 


^7~886~~~ 


685.1 ! j 










i 










54.4 














54.5 














54.6 


"793-802 


i81 1-826 


834-848 


866-876 


893-903 


907-918 


328.1 ' ! ! 


1 



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ORF 



4jLJ_ 
63.4 



__Antigemc _^gions__ !(cont)_ 

Region 30 Region_31 ; 



J74__6_ I 
20"6.f6 T 
267.1 ! 



322.1_ 
415^2 



214.3 



587.3 



889-911 



685.1 



— i. 



927-936 



_54_3_ 
54_4_ 
S4l5 

_54.6_ 
328.1 



925-944 



951-997 



l 



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SEQUENCE LISTING 






(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME : Human Genome Sciences, Inc. 

(B) STREET: 9410 Key West Avenue 

(C) CITY: Rockville 

(D) STATE: Maryland 

(E) COUNTRY: US 

(F) POSTAL CODE: 20850 

(ii) TITLE OF INVENTION: Staphylococcus aureus Poly- 
nucleotides and Sequences 

(iii) NUMBER OF SEQUENCES: 5255 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4 Mb storage

(B) COMPUTER: HP Vectra 486/33

(C) OPERATING SYSTEM: MSDOS version 6.2 

(D) SOFTWARE: ASCII Text 



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(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/009,861 

(B) FILING DATE: 05-JAN-1996 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5895 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
TCCATTATGA AGTCACAAGT ACTATAAGCT GCGATGTTAC CAATGTTTTT TAAAATCCCA 60
GTAATAAAAT CAAAAAATAA GTTAAATAAT GTATTCATTT TAAGTCCTCC TTAATAAAGa 120
aaataGGTAA TAATGTAATA GCTTCTATTA TGATGCCTAA TTGAATGAAT TGGGCAAATG 180
GCTCTTTGAT GATAAGTGTG ATAATGAAAA GGGTTAAACT AACAATAATC GCATAATATT 240
TTTTTCGTTT AATAAGTCGC ACAGGAATGG GCTTCTTTTT AGTTGCTGCA GGAGCATATA 300
CTGAGATTAC ACCTAAAGAA ATAACTGTTA AAATAATCAT AATTAAAAAG TTAATATGAA 360
AATTTACTAT TACTAAAGGT AAAAGTATAA ATAGTATAAT ACTTTCTACA TAACACCAAA 420
AAGAAGAAGG TGCATGTGCa CCATGTGCAT GtCTTCTTAT TAAATAAAAT GTTAAATTCG 480
TAATTAACGT AAACAGAAAA ATGTTTAAAA TATAGGCAAT AGTATACATA ACAATTAATT 540
TACCTATATT TTTAGCTAAG ACCTGCATCC CTAATCGTAC TTGCAAAAAT TGAATATGAT 600
CTAAGTTATT TCTCTTTTGA AGATACGTGG CAAACTGGTC AATTTTATTA TCAAAATAAT 660
TCAATTTTAC ACCACTCTCC TCACTGTCAT TATACGATTT AGTACAATCT TTTATCATTA 720
TATTGCCTAA CTGTAGGAAA TAAATACTTA ACTGTTAAAT GTAATTTGTA TTTAATATTT 780
TAACATAAAA AAATTTACAG TTAAGAATAA AAAACGACTA GTTAAf? A A AA ATTGGAAAAT 840
AAATGCTTTT AGCATGTTTT AATATAACTA GATCACAGAG ATGTGATGGA AAATAGTTGA 900
TGAGTTGTTT AATTTTAAGA ATTTTTATCT TAATTAAGGA AGGAGTGATT TCAATGGCAC 960
AAGATATCAT TTCAACAATC GGTGACTTAG TAAAATGGAT TATCGACACA GTGAACAAAT 1020
TCACTAAAAA ATAAGATGAA TAATTAATTA CTTTCATTGT AAATTTGTTA TCTTCGTATA 1080
GTACTAAAAG TATGAGTTAT TAAGCCATCC CAACTTAATA ACCATGTAAA ATTAGCAAGT 1140
GAGTAACATT TGCTAGTAGA GTTAGTTTCC TTGGACTCAG TGCTATGTAT TTTTCTTAAT 1200
TATCATTACA GATAATTATT TCTAGCATGT AAGCTATCGT AAACAACATC GATTTATCAT 1260
TATTTGATAA ATAAAATTTT TTTCATAATT AATAACATCC CCAAAAATAG ATTGAAAAAA 1320
TAACTGTAAA ACATTCCCTT AATAATAAGT ATGGTCGTGA GCCCCTCCCA AGCTCGCGGC 1380
CTTTTTTGTA ATGAAGAAGG GATGAGTTAA TCATCATTAT GAGACCCGCC GTTAAAATAT 1440
TCATTTGCAA AGGGCGAAAT GGGTTCTTAC TGAGTTATCT ATTATAAAAA AATAAACATA 1560
GACTTATGAA AAATCTCTCA TAAATCTATG TTTAGTCATG aCATGTGTTA AATATTATTT 1620
CGGGCGCTTC TTATTTATAC AAATCTAATT TAATACTTTT AAATACAGGT ATATTTTCgC 1680
GTTGCTGTTC TACTTCATTT AAGTTTAAAT CTACAGTCAA AATATCTGCG GATTCATTTA 1740
ATTCTCCAAC TAAATCTCCA TTTGGGTTTA TAACTATCGA ATGACCAGCA TATTCTGTGT 1800
TACCATCGAA TCCAGTGCTA TTAGTTCCAA TGACAAACAT ATTATTTTCA ATTGCACGTG 1860
CCTTTAGTAA TGAATGCCAA TGTTGAAGAC GTGACATAGG CCATTGCGCC ACATAAAATG 1920
CAATTTTAGC ACCACTACGA GCAGGATATC TTAATAATTC TGGAAAACGT AAATCATAAC 1980
AGATAAGTTG GGTCACATAA GTACCGTCAG ACAATTGAAA GGGTTCAGCT ACGTATTCGC 2040
CAGCGGTTAA AAATTCATGC TCTCTTAACA TAGGAACTAA ATGAACTTTG TCGTATTCaT 2100
TAATCAGCTG GCCACTTTTA TTCACACTAA AAGCTGTATT AAATATTTGA TTGTTTCTAA 2160
TGTTAGAAAC TGACCCAGCT ACGATATCGA CTTTATATTT TTCAGCTAAA TGTTTAATAA 2220
ATGAAAAACT TTGTCCTAGA TTATTATCTG CTTTTTCATT TAAATGCTCT AAATCATAGC 2280
CATTATTCCA CATTTCAGGT AAAACGACTA CATCTACTTC AGCATTCATA TITTTTTCGA 2340
ACCATTGCGT TATTTGAGTT TCATTTTTAG AACTATCTCC AAAAACAATC GGTAATTGAT 2400
AAATTTGGAC TTTCATAACA TCACATCCTT GATAGATCTT ATATATAACT TACTAAAAGT 2460
TATGTTGAAA CGCAAAAAAC GAGCACAAGA CATAAAATCA AAGTCCTAGG CTCTACAAAG 2520
TTATATTGAC AGTAGTTGAT GGGGCCCCAA CATAGAGAAA TTGGAACACC AATTTCTACA 2580
GACAATGCAA GTTGGGGTGG GCTCTAACAT AAAGAAATAC TTTTTCTTTA GAAATTAGTA 2640
TTTCTTATAC ATGAGTTTTA CTCATGTATT CCTATTCTTA AGTGCACATT AGCAGCGGCT 2700
AATGTGTAAG AACTACTACA TAATGAATAA CTAATGATTC TTTATCATTT CTGTCCCATT 2760
CCTAACAATA TATTGATTAT TTTTTTATTA CGAAACGATC TTCCACTGGA TTAAATGTTT 2820
TTTCGCCAGC AGCTTCACGA ATATCACCAA ATGGCATTTG AGCAATAAGT TTCCAACTTT 2880
TAGGAATATT AAATTCATTT GAAGTCATCT CATCAACAAG TGGATTATAG TGTTGTAATG 2940
AAGCACCTAT GCCTTTAGTA GCTAATGCAG TCCAAATTGC AAATTGATGC ATGGCATTTG 3000
TTTGAGTTGA CCATATTGCA AAATTATCAT AGTAGTTTGG CATTTGTTCT TGTAAACCAC 3060
TTACAACATC TTGATCTTCA TAAAACAAAA TTGTACCGTA TGAATGTTTG AAGTTATCAA 3120
TTTTTTGTTC AGTTGGCTCG AAATCACGAT TCTCTCCCAT GACTTCTTTT AAAATTGCTT 3180
TTGTGTTATC CCAAAATTTA TTATTGTTGT CATTTAACAA GAGAACAATT CTAGTTGATT 3240
[illegible] TCTTTCAAAT TATATATTGA ACGTCTTTCT TCCATTGCAT 3360
[illegible] TTATCTTTTr TAAATAAGCC CATAATTATT GCTCCTTCTT 3420
[illegible] ACTAAGTATA AAATTTATAC TCGTACTTGT AAAGCAATAT 3480
[illegible] TTAATATTCA TTTTCAAATT CCAAATATAA ATGCATTTTC 3540
[illegible] AGATTAATAC TTACATGAAA AAGGGAGGTG TCTCGTGAAA 3600
[illegible] AAAATGTTAC TTTCAACAAG TATTTTAATT TTAAGTAGTA 3660
[illegible] CACACAGTTG AAGCAAAGGA TAACTTAAAT GGAGAAAAAC 3720
[illegible] AATATAACTT CACCATCAGT AAATAGTGAA ATGAATAATA 3780
[illegible] GAATCAAATC AAACGGGTAA TGAAGGAACA GGTTCGAATA 3840
[illegible] TCGAATAATG TGAAGCCAGA CTCAAACAAC CAaAACCCAA 3900
[illegible] AAAACCAGAC CCAAATAACC AAAACTCAAG TCCGAATCCT AAACCAGATC 3960
[illegible] bAAACuAAAA CCGGATCCAA AACCAGACCC AGATAAACCA AAGCCAAATC 4020
[illegible] ACCAGATCCA GATAACCCGA AACCAAATCC AGATCCAAAA CCAGACCCAG 4080
[illegible] GCCAAATCCG GATCCAAAAC CAGATCCAGA TAAACCAAAG CCAAATCCGA 4140
ATCCAAAACC AGACCCTAAT AAGCCAAATC CTAACCCGTC ACCAGATCCC GATCAACCTG 4200
GGGATTCCAA TCATTCTGGT GGCTCGAAAA ATGGGGGGAC ATGGAACCCA AATGCTTCAG 4260
ATGGATCTAA TCAAGGTCAA TGGCAACCAA ATGGGAATCA AGGAAACTCA CAAAATCCTA 4320
CTGGTAATGA TTTTGTATCC CAACGATTTT TAGCCTTGGC AAATGGGGCT TACAAGTATA 4380
ATCCGTATAT TTTAAATCAA ATTAATAAGT TGGGCAAAGA TT ATC5C? Af4 A A QTTACTGATQ 4440



AAGACATTTA TAATATTATT CGAAAACAAa ATTTCAGCGG AAATGCATAT TTAAATGGAT 4500
TACAACAGCA ATCGAATTAC TTTAGATTCC aATATTTCAA TCCATTGAAA TCAGAAAGGT 4560
ACTATCGTAA TTTAGATGAA CAAGTACTCG CATTAATTAC TGGTGAAATT GGATCAATGC 4620
CAGATTTGAA AAAGCCCGAA GATAAGCCGG ATTCAAAACA ACGCTCATTT GAACCGCATG 4680
AAAAAGACGA TTTTACAGTA GTTAAAAAAC AAGAAGATAA TAAGAAAAGT GCGTCAACTG 4740
CATATAGTAA AAGTTGGCTA GCAATTGTAT GTTCTATGAT GGTGGTATTT TCAATCATGC 4800
TATTCTTATT TGTAAAGCGA AATAAAAAGA AAAATAAAAA CGAATCACAG CGACGATAAT 4860
CCGTGTGTGA TTCGTTTTTT TTATTATGGA ATAAAAATGT GATATATAAA ATTCGCTTGT 4920
TCCGTGGCTT TTTTCAAAGC CTCAGGATTA AGTAATTGGA ATATAACGAC AAATCCGTTT 4980
TGTAACATAT GGATAATAAT TGGAACAGCA AGCCGTTTTG TCCAAACATA TGCTAATGAA 5040
AATATTAATG AACTTACTGT TGTAGCAATA ATAAATGCCA CGATACGATT ACCTTTAATC 5160
GCATTAAATA ATTCTCCAAA GATTACTTTT CTGAATACAT ATTCTTCTAA TAAAGGACCA 5220
ATAATAGATA CAAAGAAGAT AAATATAGGT ATTTTTCGAG CAATAATAAT TAGCTTTTCT 5280
GTATTAGGAC TTACTTGTTG TCCACCATAA ATTTGCGTTA ATACAATGCT CACTACCATT 5340
TGATAAATCA TTACCAATGC AAATCCAAGC AATGCCCATG GAATGATATA TTTTTTAGGT 5400
TCTTTAACTT CTAATTCTAA TTTTGTTGGA TTTTTAATTT TTAAATTAAT TAAAATAATC 5460
GTCGTGGCGG CGATTAAAAA TAGAACAAGT TGTATGTAAA TGACTGCTTT AGTCAGTTCT 5520
ATGCCACTAT ATTGTACAAA TGGTAATTTT TTTACAATGA GAAGCGGTAA AAATTGAGAC 5580
AATATATAAA TAATAACAGT TAGCAATGAT GCCCATAATC tTGTCATAAT TTTCCTCCAA 5640
ATATTTGTTT ATAATTTATT TTATCGTAAA TAACTTGAAG TTACAAAACT TAATTAAAAG 5700
GTTATGACTT GAAATTTTGA CCAAATTTGA TTATTATAAA TGTATGTTAG CACTCTTTAA 5760
TGTTAAGTGC TAAACTTTAG GTTTTTTAAG GAGGAACAAT CATGCTAAAA CCAATTGGAA 5820
ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880
TGATAGTGCT AAAGA 5895
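
The layout used in this listing (rows of ten-base groups, each row ending in a running base count, with the total stated under LENGTH) lends itself to a mechanical consistency check. The following is a minimal sketch of such a check and is not part of the filing; the function names `parse_rows` and `check_declared_length` are hypothetical, and the sketch assumes that any token consisting only of digits is a position counter or a stray page line number rather than sequence data.

```python
# IUPAC nucleotide codes; the listing mixes upper and lower case and uses
# ambiguity codes such as n, k, y and w.
IUPAC = set("ACGTURYKMSWBDHVN")


def parse_rows(rows):
    """Concatenate listing rows into one sequence string.

    Whitespace is dropped, and so is any token made only of digits
    (assumed to be a running base count or a page line number).
    """
    parts = []
    for row in rows:
        parts.extend(tok for tok in row.split() if not tok.isdigit())
    return "".join(parts)


def check_declared_length(rows, declared_length):
    """Compare the parsed sequence with a declared length and list non-IUPAC characters."""
    seq = parse_rows(rows)
    invalid = sorted({ch for ch in seq if ch.upper() not in IUPAC})
    return {
        "length": len(seq),
        "length_ok": len(seq) == declared_length,
        "invalid": invalid,
    }


# The last two rows of SEQ ID NO: 1, checked here as a 75-base fragment.
rows = [
    "ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880",
    "TGATAGTGCT AAAGA 5895",
]
print(check_declared_length(rows, 75))
```

Run over a complete sequence, the same check would flag rows in which character recognition has dropped or corrupted bases, since the parsed length would then disagree with the LENGTH field.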
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6796 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TTTGAAAAAA CAAGGTACGA TTGGTTTAAT AACATATATG AGAACCGATT CTACACGTAT 60
TTCaGATACT GCCAAAGTTG AAGCAAAACA GTATATAACT GATAAATACG GTGAATCTTA 120
CACTTCTAAA CGTAAAGCAT CAGGGAAACA AGGTGACCaA GATGCCCATG AGGCTATTAG 180
ACCTTCAAGT ACTATGCGTA CGCCAGATGA TATGAAGTCA TTTTTGACGA AAGACCAATA 240
CCGATTATAC AAATTAATTT GGGAACGATT TGTTGCTAGT CAAATGGCTC CAGCAATACT 300
TGATACAGTC TCATTAGACA TAACACAAGG TGACATTAAA TTTAGAGCGA ATGGTCAAAC 360
AATCAAGTTT AAAGGATTTA TGACACTTTA TGTAGAAACT AAAGATGATA GTGATAGCGA 420
AAAGGAAAAT AAACTGCCTA AATTAGAGCA AGGTGATAAA GTCACAGCAA CTCAAATTGA 480
ACCAGCTCAA CACTATACAC AACCACCTCC AAGATATACT GAGGCGAGAT TAGTAAAAAC 540
AAAGCGTAAC TATGTCAAAT TAGAAAGTAA GCGTTTTGTT CCTACTGAGT TGGGAGAAAT 660
AGTTCATGAA CAAGTGAAAG AATACTTCCC AGAGATTATT GATGTGGAAT TCACAGTGAA 720
TATGGAAACG TTACTTGATA AGATTGCAGA AGGCGACATT ACATGGAGGA AAGTAATCGA 780
CGGTTTCTTT AGTAGCTTTA AACAAGATGT TGAACGTGCT GAAGAAGAGA TGGAAAAGAT 840
TGAAATCAAA GATGAGCCAG CCGGTGAAGA CTGTGAAATT TGTGGTTCTC CTATGGTTAT 900
AAAAATGGGA CGCTATGGTA AGTTCATGGC TTGCTCAAAC TTCCCGGATT GTCGTAATAC 960
AAAAGCGATA GTTAAGTCTA TTGGTGTTAA ATGTCCAAAA TGTAATGaTG GTGACGTCGT 1020
AGAAAGAAAA TCTAAAAAGA ATCGTGTCTT TTATGGATGT TCGAAATATC CTGAATGCGA 1080
CTTTATCTCT TGGGATAAGC CGATTGGAAG AGATTGTCCA AAATGTAACC AATATCTTGT 1140
TGAAAATAAA AAAGGCAAGA CAACACAAGT AATATGTTCA AATTGCGATT ATAAAGAGGC 1200
AGCGCAGAAA TAATATTTTT ATTTCCTAGA TACATTTTAA GATTGTTAAA TAGAATCATT 1260
AGTGAATCTT ATTTTAAAGA TAGTAAAGGA TTAATCTAAA TAAGTGCGGA TAATATAAAC 1320
ATAACAACAT AATTAAmAGA CATAAATGAC aATAAAAGGA GTATAGAAAT GACTCAAACT 1380
GTAAATGTAA TAGGTGCTGG TCTTGCCGGT TCAGAAGCGG CATATCAATT AGCTGAAAGA 1440
GGAATTAAAG TTAATCTAAT AGAGATGAGA CCTGTTAAAC AAACACCAGC GCACCATACT 1500
GATAAATTTG CGGAACTTGT ATGTTCCAAT TCATTACGCG GAAATGCTTT AACTAATGGT 1560
GTGGGTGTTT TAAAAGAAGA AATGAGAAGA TTGAATTCTA TAATTATTGA AGCGGCTGAT 1620
AAGGCACGAG TTCCAGCTGG TGGTGCATTA GCAGTTGATA GACACGATTT TTCAGGTTAT 1680
ACTACTGAAA CACFTAAAAA ^ EGA-TGAAAAT ATGAGAGTTA TTAATGAAGA AATTAATGCC 1740
ATTCCAGATG GATACACAAT TATCGCAACA GGACCACTTA CTACAGAAAC CCTTGCGCAA 1800
GAAATAGTGG ACATTACTGG TAAAGATCAA CTTTATTTCT ATGATGCGGC TGCTCCAATT 1860
ATTGAAAAAG AATCTATTGA TATGGATAAA GTTTACTTAA AGTCCCGTTA TGATAAAGGT 1920
GAAGCTGCAT ATTTAAACTG TCCTATGACT GAGGATGAAT TTAATCGCTT TTATGATGCA 1980
GTATTAGAAG CTGAAGTTGC GCCTGTAAAT TCATTTGAAA AAGAAAAATA TTTCGAGGGT 2040
TGTATGCCTT TTGAAGTAAT GGCAGAACGC GGACGCAAGA CATTACTATT TGGACCAATG 2100
AAACCAGTAG GATTAGAAGA TCCAAAGACT GGGAAACGTC CTTATGCGGT GGTTCAATTA 2160
AGACAAGATG ACGCTGCTGG TACACTCTAC AATATTGTTG GCTTCCAAAC GCATTTAAAA 2220
TGGGGAGCTC AAAAAGAAGT CATTAAATTA ATTCCAGGCT TAGAAAATGT TGATATTGTT 2280
AGATATGGTG TGATGCATAG AAATACCTTC ATTAATTCAC CGGACGTATT AAACGAGAAA 2340



TATGTAGAAA GCGCAgcTAG CGGCTTAGTT GCAGGTATCA ATCTTGCGCA TAAAATATTA 2460
GGCAAGGGTG AGGTAGTATT TCCGAGAGAA ACAATGATTG GAAGTATGGC TTACTATATT 2520
TCTCATGCTA AAAACAATAA GAATTTCCAA CCTATGAATG CTAACTTCGG GTTATTACCA 2580
TCTTTAGAAA CTAGAATTAA AGATAAAAAA GAACGCTATG AAGCACAAGC TAATAGAGCT 2640
TTGGATTACT TAGAAAATTT CAAAAAAACT TTATAAAATA GTTAGAAAGA CTAGATATGC 2700
TATTCATTCT TAAGTCATCA ACGAGTAAGT AATGACTTTC TAAATGGAAA ATACTTATCC 2760
TAGTCTTTTT AATTTTGGAA TTGTTACGTA TTTCTGACAA TTTAGAATTC GCATTCAAAA 2820
AATATCTAAA TAAATAACAC GCAATAAGTT GATTGATGTA ACATGTAAGA GAATGTTTTA 2880
AATAAACTTT ATTTAAAAGG CAATGAAATA ATAAATGGCA AGGCTATTAA TAAAGACTTT 2940
TAGTAATTAA TTTAAAAAAG AGGTATTCTA ATTAACAGGT TTTCCGATTA GTTACAATTA 3000
TTTAATTCTC AAAAGATTTA GAATTGATTA TCAAATTACT GTAAGCCCTT TGCTGTATAT 3060
GCTACAATTC TTATTGATGG AGGGTAAATG TATTGAATCA TATTCAAGAT GCGTTTTTAA 3120
ATACATTGAA AGTTGAACGG AATTTTTCGG AACACACATT GAAATCATAT CAAGATGACT 3180
TAATTCAGTT TAATCAATTT TTAGAACAAG AACATTTAGA GTTGAATACT TTTGAATACA 3240
GAGATGCTAG AAATTATTTG AGCTATTTAT ATTCAAATCA TTTGAAAAGA ACATCTGTTT 3300
CTCGTAAAAT CTCAACGTTA AGAACTTTCT ATGAATATTG GATGACGCTT GATGAGAACA 3360
TTATTAATCC ATTTGTTCAA TTAGTACATC CGAAAAAAGA AAAATATCTT CCGCAATTCT 3420
TTTACGAAGA AGAAATGGAA GCGTTATTCA AAACTGTAGA AGAGGACACT TCAAAAAATT 3480
TACGGGATCG AGTTATTCTT GAATTGTTGT ATGCTACAGG CATCCGTGTT TCGGAATTAG 3540
TAAATATTAA AAAACAAGAT ATAGATTTTT ACGCGAATGG TGTTACCGTA TTAGGAAAAG 3600
GGAGCAAAGA GCGCTTTGTA CCGTTTGGTG CTTATTGTAG ACAAAGCATC GAAAATTATT 3660
TAGAACATTT CAAACCAATT CAGTCATGCA ATCATGATTT TCTTATTGTA AATATGAAGG 3720
GTGAAGCAAT CACTGAACGC GGTGTACGAT ATGTTTTAAA TGATATTGTT AAACGAACAG 3780
CAGGCGTAAG TGaGATTCAT CCCCACAAGC TCAGACATAC ATTTGCAACG CATTTATTGA 3840
ATCAAGCj I VjU [illegible] TrATfiTTAAT [illegible] TTGTCAACAA 3900
CTGGTAAATA TACACACGTA TCTAACCAAC AATTAAGAAA AGTGTATCTA AATGCACATC 3960
CTCGAGCGAA AAAGGAGAAT GAAACATGAG TAATACAACA TTACATGCAA CAACAATTTA 4020
TGCTGTAAGA CATAATGGGA AAGCAGCTAT GGCTGGAGAT GGGCAAGTAA CGCTTGGTCA 4080
ACAAGTCATC ATGAAACAAA CGGCAAGAAA AGTGCGACGT TTATATGAAG GTAAAGTGTT 4140
ATTACAACAG TTTAGTGGTA ACTTAGAAAG AGCTGCTGTT GAATTGGCAC AAGAATGGCG 4260
AGGCGATAAA CAATTACGTC AATTAGAAGC TATGCTAATT GTAATGGATA AAGATGCTAT 4320
TTTAGTTGTC AGTGGAACTG GCGAAGTTAT TGCTCCAGAT GATGACCTTA TCGCTATTGG 4380
ATCAGGAGGC AACTACGCAT TAAGCGCAGG ACGTGCATTG AAACGCCATG CATCGCATTT 4440
GTCTGCTGAA GAAATGGCAT ATGAGAGCTT GAAAGTAGCG GCTGATATTT GTGTCTTTAC 4500
CAACGATAAT ATTGTTGTCG AAACACTATA ATAATCAGAG CACGATAAAT AATTACGAGC 4560
AATTAATTTT AGTTAAAAGA CGGAGGAATG AAATTAATGG ATACAGCTGG AATAAGATTA 4620
ACTCCAAAAG AAATCGTATC TAAATTAAAT GAATACATCG TTGGACAAAA TGATGCTAAA 4680
CGTAAAGTGG CAATTGCCCT ACGTAATCGA TACAGAAGAA GTTTATTAGA TGAGGAATCA 4740
AAGCAAGAAA TTTCACCTAA AAATATTTTG ATGATTGGAC CAACCGGCGT TGGTAAAACT 4800
GAAATTGCAA GAAGAATGGC CAAAGTTGTC GGCGCGCCAT TTATAAAAGT AGAAGCTACT 4860
AAATTTACTG AGGTAGGTTA TGTAGGACGA GATGTTGAAA GTATGGTTAG AGATCTTGTT 4920
GATGTTTCAG TAAGATTAGT CAAGGCGCAG AAAAAATCAT TGGTACAAGA TGAAGCAACA 4980
GCTAAGGCCA ATGAAAAACT TGTTAAGTTA TTAGTTCCAA GTATGAAAAA GAAAGCGTCT 5040
CAAACGAATA ATCCTTTAGA GTCACTTTTC GGAGGTGCAA TTCCAAATTT CGGACAAAAT 5100
AACGAAGATG AAGAAGAACC ACCTACTGAG GAAATTAAAA CAAAACGTTC TGAAATTAAG 5160
AGACAGCTAG AAGAAGGCAA ACTTGAAAAA GAAAAGGTAA GAATTAAAGT CGAACAAGAT 5220
CCTGGTGCTT TAGGTATGCT AGGTACAAAT CAAAATCAGC AAATGCAAGA GATGATGAAT 5280
CAATTAATGC CTAAAAAGAA AGTTGAG[illegible] 5340
TTAGCTGATA GTTATGCGGA TGAACTAATT GATCAAGAAA GCGCTAACCA AGAAGCGCTT 5400
GAATTAGCAG AACAAATGGG TATCATCTTT ATAGATGAAA TCGACAAAGT TGCGACGAAT 5460
AATCATAATA GTGGTCAAGA TGTCTCAAGA CAAGGTGTTC AAAGAGATAT TTTACCTATA 5520
CTTGAAGGTA GCGTTATTCA AACCAAATAT GGTACTGTGA ATACTGAACA TATGCTGTTT 5580
ATAGGTGCTG GAGCTTTCCA TGTATCTAAG CCGAGTGACT TGATACCAGA ATTGCAAGGT 5640
CGTTTTCCGA TTAGAGTTGA ACTTGATAGT TTATCGGTAG AAGATTTTGT AAGAATTTTG 5700
ACAGAACCAA AATTGTCATT AATTAAACAA TATGAAGCAT TGCTTCAAAC AGAAGAAGTT 5760
ACTGTAAACT TTACCGATGA AGCAATTACT CGCTTAGCTG AGATTGCTTA TCAAGTAAAT 5820
CAAGATACAG ACAACATTGG TGCACGTCGA CTTCATACAA TTTTAGAAAA GATGCTAGAA 5880
GATTTATCAT TCGAAGCACC AAGTATGCCG AATGCAGTTG TAGATATTAC CCCACAATAT 5940
AAATATACAA AAGGAGAAAA ATTCATGAGC TTATTATCTA AAACGAGAGA GTTAAACACG 6060
TTACTTCAAA AACACAAAGG TATTGCGGTT GATTTTAAAG ATGTAGCACA AACGATTAGT 6120
AGCGTAACTG TAACAAATGT ATTTATTGTA TCGCGTCGAG GTAAAATTTT AGGATCGAGT 6180
CTAAATGAAT TATTAAAAAG TCAAAGAATT ATTCAAATGT TGGAAGAAAG ACATATTCCA 6240
AGTGAATATA CAGAACGATT AATGGAAGTT AAACAAACAG AATCAAATAT TGATATCGAC 6300
AATGTATTAA CAGTATTCCC ACCTGAAAAC AGAGAATTAT TCATAGATAG TCGTACAACT 6360
ATCTTCCCAA TTTTAGGTGG AGGGGAAAGA TTAGGTACAT TAGTACTTGG TCnAGTACAT 6420
GATGATTTTA ATGaAAATGA TTTGGTACTA GGTGAATATG CTGCTACAGT TATTGGTATG 6480
GAAaTCTTAC GTGAGAAGCA TAGTGAAGTA GAAAnAGAAG CGCGCGATAA AGCTGCTATT 6540
ACAATGGCAA TTAATTCATT ATCTTATTCT GAAAAAGAAG CGATTGAACA TATCTTTGAA 6600
GAACTTGGCG GTACGGAAGG CCTATTAATC GCATCAAAAG TTGCAGATAG AGTTGGTATT 6660
ACTAGATCTG TAATTGTAAA TGCACTACGT AAATTAGAAA GTGCTGGTGT AATTGAATCA 6720
CGTTCTTTAG GAATGAAAGG TACTTTCATT AAAGTTAAAA AAGAAAAATT CTTAGATGAA 6780
TTAGAAAAAA GTAAAT 6796
(2) INFORMATION FOR SEQ ID NO: 3: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATCCTAAAAT TnAAAATTAT CACGCCTTTT GaACAGCTTT GTAACCaTCt GGACGATCAT 60
kAAATTCCaA TGTAAATCCT GGTTTAAaGT TGATCTTTAA CCTTATTTAA AyCACCAATT 120
GTACGTATAT TATGTTGTTT AGCAAAATCA CGTTTTACAG CTAAAGCATA CGTATTGTTA 180
TACTTCATTG GTTTTAACAT AGTCATTTGA TATTTCTTTT CAAGACTTTG CTTAGCTTGT 240
TCATAAACTT TTTTCTCTTC TTTTGACTTC AATGGTTCTT TTGTTAATTC ACCTAAAACT 300
GTTCCAGTAA ATTCTAAATA CCCATCTATA TCGTCAGATT TTAAAGCATT AAATAAAAAT 360
GCTGTTTTGC CCATACCATC TTTCACTTCT ACAGTATTTT TGGTCTCTTC TTCTATTAAA 420
ATTTTATACA TATTTGTAAT AATCGATGGC TCGGAGCCAA GCTTTCCAGC TAACGTAATT 480
TTATCACCTT TTTGTGCAAA CATAGGAATA GCGATAGCCA GTATAATAAT CATCACTATA 540
TCAAATATAA TTGCCAATAA GGCTGCTGGA ATTGCACCTA ATAATATCAA CGATGCATTG 660
TTACGGTCTA TACCTAATAA AATTAAATCT CCTAGTCCGC CTGCACCAAT TAATGCTGCT 720
AGTGTTGCTG TACCTATAAT TAATACCATA GCCGTTCTTA CACCAGCCAT TATAACAGGC 780
ATTGCTATCG GAAGTTCGAC TTTAGTTAAA CGTCTAAATG GTTTCATACC TATACCTTTA 840
GCCGCTTCAA TGAGTGATGG ATCAACTTCT TTAATTCCAG TATACGTATT CCTTAAAATT 900
GGTAACAACG CATACACTAC AAGTGCAATA ATTGCTGGCA CACGACCGAT ACCAAATAAA 960
GGAATCATTA AACCTAATAA TGCCAACGAT GGTATGGTTT GAAGAATTGC CGCAATATTC 1020
ATTACGATTT CAGATATCGT TTTAGTCTTC GTTAATAAAA TACCTAATGG TACCGCAATA 1080
GCAGTTGCAA TCAATAATGC GATAAATGAT ATTTGAATAT GTTCTATCAT TGTCGAAAAG 1140
AGTTGCCCCT TACGTTCACT CAATATGTCg AAAAAGTTAG TCATGTTGAG CTACCTCCTT 1200
TTTCTGGGAC AAATATTTGA AGATATCTTT CCTATCAATA ACATATTGAC CTACGCTATC 1260
TTCTTGCATG ACAATGACAC GCTCGCTCTC TGATAAAAGT TGATACAATA CTTCAATTGG 1320
TTGATTGTCA TAAACAATTG GATAAGCGCT CATAGATGTA ACCTCATCGA TTGGTTTCAT 1380
AATATCCAAG TCACGGATAA TTGCGTTCTC TTCAACACAT GGCGCATCAT CTTCTAAATG 1440
ACTACCCATA AATTGTTTAA CAAATTCACT TTGAGGATTA TTTTTAAATC CTTCTGGTGT 1500
GTCAATTTGT TCAATATGCC CTTCATTCAA AAGACAAATC TTATCACCAA GTTTCATCGC 1560
CTCTTGAATA TCATGTGTAA CAAATATGAT TGTCTTCTTA ATTTTAGTTT GTAATTCAAT 1620
TAAATCATCT TGAAGTTTTT CTCGGCTGAT TGGGTCTAAT GCACTAAACG GTTCATCCAT 1680
TAAAATAACT GGTGGATCAG CTGCTAACGC ACGTATAACT CCTACACGTT GTCGTTGCCC 1740
CCCTGACAAT TCATCAGGTT TTCTGTTTTT ATATTTTTCA GGTTCTAATC CAACCATTTC 1800
AAGTRATTCA TCTACTCTTT TATCTATATC TTTTTCTTTC CACTTTTTCA TTTGTGGCAC 1860
TTGTGCAAtA TTTTCTTTGa wTGTCaTATG TGGGAATAAT GCAATCTGCT GcAATACGTA 1920
TCCAATATCC CAACkCATTT CGTATACTGG ATAATCACTT ATTGGTTTAT CTTTAAAATA 1980
AATATAACCT TCACTTAAGT GAATGAGTCG ATTAATCATT TTTAATGTCG TAGTTTTTCC 2040
ACAACCTGAA GGTCCAATTA GCACAAAAAA TTC 2073
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13321 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ACTATTCTAG CTTCATCAGT TATCATATAT TCTTTGAAAC ACTTGTAAGA AAATATAATG 60
AGTATTTACT ACATAATGAT ATTTCAAATT AGAAAAAAGG AAGTTATGAT TTAATGGCCT 120
TGAGCCTATC ATAACTTCCT TTTATCATTT TATTGTTGTG TTGATGTTTC GATAACGTGG 180
TACATCTTAT CAAACATCAA TTCGAAACCA TGCACCATGG CATCATGATA TTCTTTTTTC 240
TTTTGCTTGT ATTCTAAATT AGTAAATCGT CTTTCTTTTT CAACTAATGA ACGATAATAA 300
AATAGCATTT GGGTGCCACC TGTTTCACGT TCAAAAAATT CTACCTCAAT GACATCTTGC 360
GTTTCACTTA GTCCAGGCAT ACCGATAGTC ATCTTAACGT ATTCATCCAT AACTAAAGAT 420
TCATAAATGC CTTCAATCAC ATTTACTTTG CCATTACGTT GTTGATCTAC AATACGATAT 480
TTACCGCCTT CTTTAACGTC CGCTTCAATC TCTTTATTCG TTCTGGCTGA TGTCATAAAC 540
CATTGTTTCA ACAAATCTTT CTTTGTCCAA GCTTCGTATA CTAACTCTGG AGAAAATTTA 600
TAAAGCTTTT CAATTTCAAC TTCGACATGT TCATTCTCTA CATTAAATTT TGCCACTGTT 660
GTCCACCCAC TTTCGCTCTT ACTTTTATTT TAACGTATTT TTGCTCAGTT CCAAACATAG 720
ATGATCATCA TTTTTAAAAG ATTAGCGTTA TACGGTGAGT ACAACATGAT CTGTTAATAT 780
AACAAGCCAC CTTACTTGGC TACATCGATA TATTGTTAAG CATTAATGTT TCATTTCTTG 840
ACTAGTGTTC TTTTTTAGCT TTGGAAAATT AAATAAAATC GCAATAAGTC CGCATACACC 900
TAATAATATA GGATAAATGC TGTATGGGAA TAACATTAAC GGTGAAATAC CAGCTACACC 960
AGCCGCTGaA ATGACTTGCG GGCTATATGG TAATAAACCT TGGAAGCAGC CTCCAAATAT 1020
ATCAAGAATA CTTGCTGATT TCCTTGAATC TACATCATAT TCATCTGCAA TATTTTTAGC 1080
TAAAGGACCT GACATAATAA TAGAGATGGT GTTGTTTGCC GTGGCAATAT CTGCGACACT 1140
TACC&AACTA GCAATTCCTA ATTCTGCGCC ACGCTTTGAT TTCACTTTAG AGCGAACAAA 1200
TTGCAACAAC CATTCAATAC CACCATTGTG TTGAATAATA CCGACTAAAC CACCAATTAG 1260
CAACGCAATC ATAGCAATAT CTTCCATGCT TATAATACCT TTGGACACTG CATCTAGTAG 1320
CCCCATCCAA CCGAATGAAC CATCTATGAG ACCAATGATT CCGGCTAATA ATGTTCCGCC 1380
AATCAATACG ATAATGACAT TTACACCTAA TAATGCTAAT ACCAATACTA AGATATACGG 1440
TACAACTTTA ATTAGATTAT AATCATAGTt TTTAGCATGA TTTAAAGAAA TGCCATTCGT 1500
TAAGAAATAC AGAATAATAA TCGTTAAAAT AGCACCTGGC AATACAATTT TAAAGTTTAC 1560
TCTGAATTTA TCTTTCATTT TCGTATGTTG TGTTCTAACC GCAGCAATTG TTGTATCTGA 1620
AATCATTGAT AGATTATCGC CGAACATTGC ACCTCCAACA ACTGTAGCCa tTGCtAGCGC 1680
TCCTACAGAC GTCCCCATAG ATATAGAAAC AAACATACAA ATCACAAACA ATCCTACAAT 1800
AATTAAATTT TCTGGGATTA ATGATAGTCC TAAATTAACT GTCGACTTTA CGCCACCCAT 1860
TTTTTCAGCT GTATTTGAAA ATGCACCTGC TAAAATAAAA ATCAACATCA TTAAAACAAT 1920
GTTTGAATGG CCTGCACCTT TCGTGAAGAC CTCAACTTTT TTAGCAAATG ATTCTTTTCG 1980
ATTCATTAAT AACGCCACAA TTACCGTTAT CGTAATTGCA ACATTTAATG GCATTGAAGT 2040
AAAATCACCT GTGATAATAC CTACGCCTAA AAACAACGCC ACAAATAATA ACAAGGGGAA 2100
TAATGCCCAA GCATTGCTCT TTTTATGTAC TTCCATCCTT TTTACCTGCT TTCCAATTAA 2160
AAATACCTCT TTCTCACAAA CGATGAAGAA AGAGGTTTTC ATGTGCTTTA CCTGCTTATC 2220
TTCAAACCAT TACGGTTACT GGAATTGGCA CATTCGAGAT GTTGCCGAGG CTTCATAGGG 2280
CCAGTCCCTC CACCTCTCTA GATAAGTGAT GCTTATTTAC GTTTACGTTA CAAGATAATC 2340
CTTAGTACGT CAATCATAAA TTAATCAGGA GTCGTATAAT ATTTTTCATA AACAATCATT 2400
GCTACTGTAA TAATAATCAA AACAATAATG CTAATAACAA GTAAAAGCCA CCATTTAAGC 2460
ATTAATGCAA TAAAAATGAA CACGATAGAC ACACTTACTA ATATTAATGA TATGACTTTA 2520
AATTGCTGAA CACGTTGCTT GGAGATGACT TTCAACTGTT TGTTTGATAG ACGCGTATTT 2580
TTTATACTGA TTCCCAGTAT ATTTTCTAAT ATTTGAACCA ATACGATACT TATTGCAAAT 2640
ATAATAATTG GTAAAACATC ATAGCTCCCT ATAGTTAATG TATAAATTAC AAATCCAATG 2700
TAAAGTAACC CTGAGACAAA GGATAAAAAG TATGCGACGT ATTTGTTAAA CTTAATGATA 2760
TGCTTTTTAA CGTTTTGATG TGTAAACCAT ACATTCGAAA CGATCGCAAC TGCTACAAAT 2820
AATGTGAATA CTATATATAA 3?GGTAATTTT TGTTeAGGAA AAAeAGTeGe TATTCCAAAA 2880
GCTAATGCTA AAATCAAAAA TAATATAGCT CTAGATACTA TTAATGCCAT AATAACAACC 2940
CCTTTGTTTA ATATCGAGTT TGCAAATTTA CGTTTATCAG CGTTTCTATG ATCAGTACTT 3000
CTACGGGTAG CGTTTCTATG TAATTTACAT CATCTTAACA TATAAATACT TCGCTATTTA 3060
ATTGAAAACA TATCCTATTA TTCTTTGTCC GTTCTGACGT TTAATATCTA GCCTTAGGCA 3120
TTTCACTTGT TAATGAATTT AACTTTCTTC CACTAACCGT CCCTAAACCC AATCCCGCAA 3180
CAGTTTTTAA CTTTTTCGTT GTTGTCCTGA CATCCTCATT AAGAAAGTTT ATTCTGCTTA 3240
AAACTTATAA TCCACACCCT GAGCAAACGC TCCTTATGAC AGAGTATTAA AATAAGCCGA 3300
TAAAGATACA CACCTTTACC GACTATTTAA AATACACTTC ACCAATTCAT TTTAATTTAA 3360
TGGATTGAAG TAACTAAATT AATATTATGT TGTTCAATTA AAAGCTTCAT ACAAACCTAA 3420
TCTATTTGCA CTCCACCGCT AACACCGAAC ACTTGTCCGG TTGTATAACT TGATTCTTCT 3480
GTTTTTTGAC CAAATGTTGG GATTTTACTT TGAGGTTGTC CACCAGAAAT TTGTAATGGT 3600
GACCAGAATG GACCAGGCGC TACACAGTTC ACTCTAATTC CTTTTGGTCC TAATTCTTCT 3660
GAAAAACTTT TAGTTAATGA AATAATTGCT GCTTTTGAAG CGGCATAATC ATGAAGAATA 3720
GGACTAGGAT TATAACCTTG TACAGATGAT GTCGTTGTAA TTGACGCACC CGGTTTTAAA 3780
TATTCCAATG CTTTTTGAAC TGTCCAAAAT AGCGGATAGA CATTCGTTTC AAATGTTTCT 3840
GTAAATGCCT CAGTTGTAAA TCCATGAATA TCATCATGAT ACTGTTGATG TCCAGCAACT 3900
AAAGTAACAT TATCTAAGCC ACCTAATTGT TGATATGCTT GTTCAACAAG GTCATAGTTG 3960
AACTGTTCAT CTCTTATATC ACCAGGAATT AACACTGCCT TTTGACCACT TTCTTCAATC 4020
ACTTGGCGTA CTTCTTGTGC ATCTTGTTCT TCACTCGGAA GATAGTTAAT CGCTACATCT 4080
GCACCTTCTT TAGCATACGC AATTGCTGCT GCACGCCCTA TTGCTGAGTC ACCACCTGTG 4140
ACTAATATTT TATAGCCTTG TAAGCGTTGA TGACCTTGGT AAGACGTTTC GCCACAATCG 4200
GGTGCTGGCG TCATTTCAGA TTGTAAACCC GGTACCTCTT GTTCTTGTTT TTCATAATCC 4260
GTTGTTTTAA ATTTTGTTCT AGGATCTTGA GCTGCCATTT TTTTACATCT CCTTATTCGC 4320
TTAATGGTTA TTATTTACCC AATCTTCCTA GGAACTTAAT CATGATTACA CTAAAAATTA 4380
CTTTCTTCTT TATAAAAACA AGCTCGAATT ATTCATGCAA TAGTCTCTTT ACAAATTCAA 4440
CAAAATACTC AGGTACTTTT TCCAGAATCC TTTCATCCGG TTTATATTGA GGATGATGTA 4500
AATCATATTC ACTATGAGAA CCAATTAACG CAAATACACT TGGAAAATGT TGACTATAAC 4560
CTGAAAAATC TTCTCCAATC GTAAGCGGCT GTTCCATCAT TCCCACCTTA TATCCAACAT 4620
GTTGGGCTAC TGCAATTGCT TTATGCGTCA ATGCCTCATC ATTCATCACA GCGCCAGGTA 4680
AATGCGTATA ATTTAAATTA ATTTTCATAT TATATGCTTG AGCCAATCCG TCCGCAATAT 4740
CTTGJAATCG TGTTTCTACA AGCTTTCGTA CCACAGGATC AAAACTACGC ACTGTGCCTT 4800
GTACATACGC ATGATCAGCA ATGACATTCC AAGTATTACC ACATGATATT TGTCCAATTG 4860
TTACTACCGC TTCATCAAAC GCAGATAGAT TTCTACTAAC TATGGATTGA ATACTATTAA 4920
TCAATTGCGC CAACACAATA ACTGGATCGT TGCATTGTTC TGGcTTTGCA GCATGACCAC 4980
CCACGCCTTT AATATGAAAC TCAAAACGAT CTACTGCTGA TGTAATTGCC CCTGTTTTGA 5040
TTGCAAATGT ACCTACCGAA CGCGATGGGT CATTATGAAA ACCCAATACT GCTTGTACAT 5100
CTTTTAATGC ATGTGTTTCA ATAATTTTAA AAGCGCCATG TCCTAGTTCT TCTGCTGATT 5160
GAAAAATGAA TTTAACACGC CCAGTAAGAG TGCCCTCAAT TTCTTTTAAT TTTACAGCTG 5220
TAGCCAAAAT ACTAGCCATG TGAATATCAT GACCACACGC ATGCATAACA CCTTCATTTT 5280
CAGCTATACA ACTCAGACCT TGTCCCACTT CAGCAACAAG CCCAGTCGCA AGTGGTAAGT 5400
CTAATATTCT AATATGATGT TCTGTTAAAA TATCTTTAAT TTTTTGTGTA GTCTTAAATT 5460
CTTTATCGGA TAGTTCTGGA AATTGATGAA AATACCTTCT CCAGGTAACA GCTTGATCTT 5520
TTAATCCCAT CGGTCATTCC CCTTCCTTAA GTCAATGATA TGTTGTCTAC CCTACGATGA 5580
TCATCTTTGA CTATTAAACG ATGATTTCAC AACAATGTAC TCTTGTTAAT TGCTTTCGTT 5640
AATGATAGAC AGTTGTTTAA TAATATCGTA ACACTGTTGT CAAACTATTC TAACTTTTAT 5700
AATTGAGACT CTATACAAAA ACGTGTTCTC GAATATACTT GTTTTTACAA ACCACAAAAA 5760
GCTCTAAACA TTAGTTTAAA CCAATGCTTA GAGCTTTCTA ATTATTTTAT GCTTTAAAAG 5820
ATACTGTGTT ATCTACGATG ACCTTACCGT CTTTAATAAC TTTTTCTGCG TGATTGATAC 5880
CAAAATGATA TGGAATATAT TCATGATTTG GTGCATCCCA AATTACTAAA TTAGCCTTAT 5940
CACCTGTGTT AATTGTACCC GCGTTAATGT CTATTGCTTT AGCAGCATTG ACCGTAACAG 6000
CATTCCAAAC TTCATTAGGT GATAGCTTTA ATTTCAAGGC TGCAATCGCC ATAACAAGTT 6060
GTAAGTTGTT TGTGACACTA CTACCAGGGT TATAATCAGT TGCTAATGCA ATCGCACCGT 6120
TATTGTCAAG CATGCCTCTT GCATCTGCAT AATCTTCTTT ACCTAAATAG AACGTCGTTG 6180
CAGGTAAGAG GACAGCTACA GTATCACTAT TTCGCAACTT TTCTTTTCCT TTATCACTAG 6240
AAGCTACTAA GTGGTCTGCT GATATTGCTT GTTCATCAAT TGCTAATTCC AGTCCGCCTA 6300
ACGGATCAAT TTCATCCGCA TGTATTTTCA CTTTAAAACC TGCTTCTTTG gctttttgca 6360
TATAATGTTG CGATTGTTCT ATTGTAAATA CACCTGTTTC ACAGAAAATA TCCGCAAAGT 6420
CTGCATAT.TG TJirrACTTCC GGAAGTAAGG GAATGATTTG TTCTAAAAAT GCCTCATTTG 6480
AACTTGCCTC TTTAGGTACA GCATGAGGCC CTAGGAAAGT ATGTTTCATG TCTAAATCAT 6540
Al-ITLTCAGC TAAACGATTA GACACTTTCA ATTGCTTCAG TTCATTTTCT CTATCTAATC 6600
CATAACCACT CTTACTTTCA ACTGCAAGCA CGCCGTGTTT AATCATAGTA AGCAAATCAT 6660
GCTCTGCTTT TTTAAACAAG TCATCTTCGG ATGTTTCTCT AGTAGCATTA ACGGTAGATA 6720
ATATGCCACC ACCCATTTCT AATATTTCAA GGTAAGACTT ACCTTGACGT TTTAATGACA 6780
TCTCATGTTC TCGAGATCCA CCAAATGTTA AATGGGTATG TGCATCTACT AATGCTGGGG 6840
ACACTACCTT CCCACTAGCA TCAATCGTCT CAGTCGCATC GTAGTCATCT GTATGTGTTC 6900
CAGCATATAC AATTTTGCCA TCTTTAATGA CAACTGTACC ATTTTTCACA ACATTTAATT 6960
CATCTAATTC CTTACCCTTC AAAGGTTTAT CTGTTGATCT CGGTAAAATT AATTCTGCTA 7020
TATGATTAAT TATTAAATCA TTCATTACTT ATCACCTGCT TTATCAATCA TTGGAATATG 7080
AACACCCATA CCTGGGTCAG TCGTCAATAC ACGTTCCAAT CTTCTTTCAG CACGCTCTGA 7200
TCCATCTGCT ACAACAACCA TACCCGCATG AAGTGAATAT CCCATGCCAA CACCGCCACC 7260
GTGATGGAAT GAAATCCATG AACCACCTGC AGCTGTGTTA ATGAGTGCAT TCAATACAGC 7320
CCAATCACCA ACCGCGTCAC TACCATCTTT CATACTTTCT GTTTCACGGT TAGGACTAGC 7380
AACTGAACCA GCATCTAAAT GGTCTCGTCC AATAACAATT GGTGCTGAAA TTTCACCGTC 7440
ACGTACAAGA CGATTTAAAG CTAAGCCCAT TTTCGCTCTT TCTCCATAGC CTAACCAAGC 7500
AATACGTGAT GGTAGTCCTT GATATGAAAT TTTTTCTTCA GCTAAATCAA GCCATCTTAA 7560
TAACTTTTCA TTTTCTGGGA AAAGTTTGCG CATTTCTTCA TCCGCACGCT CGATATCTTT 7620
TGGATCACCA CTCAACGCAG CAAAGCGGAA TGGCCCTTTA CCTTCACAGA ATAATGGTCT 7680
AATGTAAGCT GGTACAAAGC CTGGGAAGTC AAAAGCATTT TTCACTCCGT TATTGAAGGC 7740
TACTTGACGA ATATTGTTAC CATAATCAAA TGCTACAGCG CCACGTTTTT GGAATTCAAG 7800
CATTAATTCA ACATGCTTTG CCATTGAAGC TTGTGACAGT TCAACATATT TTTTCGGATC 7860
TTTTTCACGC AATACTTTCG CTTCTTCTAC AGAGTATCCT TGTGGCACAT ATCCATTTAG 7920
CGGATCATGT GCACTTGTTT GGTCAGTAAT AATGTCAATT TTAAATCCTT TTTCTAGAAT 7980
CGCTTGATGG ATGTCTACAG CATTTCCAAC TAACCCGATT GATAATCCTT CTCCACGTTC 8040
TTTCGCCTCT TCTGCTAATT TTAATGCTTC ATCTAAATCA GCTGTTTTAA CATCACAGTA 8100
TTTCGTATCA ATTCGCTTAT CAACACGTGT TTCATCAACA TCCACGCAAA TTGCTACCCC 8160
ATGATTCATA GTAATTGCTA ACGGTTGCGC ACCACCCATA CCACCTAAAC CTGCTGTCAG 8220
TGTAACAGTG CCTGCTAAAT CTCCATTAAA GTGTTGATTA CCTAGCTCGG CAAATGTCTC 8280
ATAAGTACCT TGCACAATAC CTTGAGAACC AATATATATC CAACTACCGG CTGTCATCTG 8340
TCCATACATG ATTAAACCTT TTTTATCTAA TTCATTAAAA TGATCCCAGT TTGCCCATTC 8400
AGGCACTAAT ACTGAATTTG AAATTAATAC ACGTGGCGCT TCTTCATGTG TTTTAAATAC 8460
AGCAACTGGC TTTCCTGATT GTACTAACAT TGTCTCATCT GATTCTAATT CTCGTAACGT 8520
TTTCTCTATT GCTTCAAAAG CTTCCCAATT ACGTGCTGCT TTTCCAATAC CACCATAAAC 8580
AACTAAATCT TCTGGTCTTT CAGCAACTTC TGGGTCTAAA TTGTTGTATA ACATTCTAAG 8640
TACTGCTTCT TGTTCCCAAC CTTTACACTC AATACTCAAA ccmrrrrG CTTGAATTTT 8700
TCTCATAAAA TTCGCTCCTG TTCTTTTAAG AAGTTAATTC CACTAAATTT AAAACGCTTA 8760
CATTATTATC TTCAATATTC ATTATAGTAT GTTAAAATAT AGCCAACAAA TATAAATAAA 8820
CTAATTATCC ATAGCTTGAA TCTATAAATA AAAGGAGCAA AACACATGAA AATTATTCAG 8880
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w 






CATATTAGCC 
GACTTATTTA 
CGTTATGCGA 
AGCGTTACAT 
GCGAATTTAA 
ATACATGATA 
ACAAATGAAA 
ATTTTATTAG 
CCACTCATAT 
AGAAGAAATA 
TTTGTTCATC 
ACGTCTAATT 
TACCATAAAA 
TATTTATATG 
CTCAGTCAAC 
CTCAGTCAAC 
CTCAGTCAAC 
CTCAGTCAAC 
"CGTAGTCAaC" 






CGCAGATCAT 
TTATAAAAAT 
TTCAGTCAGA 
GCTATTGACT 
TGAAGAAAAA 
TATAGATGAC 
TTTAGAAGGA 
TGGTCATAAG 
GGCTAAACCA 
CCGTTAAAAT 






AGCCATCTTT 

CACGTTCAAC 

GCGAATTAGT 

CAGAACCAAG 

TTCGAAAGCA 

AACATCAATC 

AAATAAC CCA 

CACCCAAGGA 

TACCAAACAA 

TTCGTCCAAA 

TCGGCTTAGG 

TAGAATATAA 

AACGCAAACA 

GACTTTTAGA 

TGTATAC CTT 

TGTATAC CTT 

TGTATAC CTT 

TGTATAC CTT 

TGTaTAC'CTT" 

CGTATAAAAA 

TCTAAGAAAG 

AATTTAAACG 

GGTAAAAAAA 

GATATACCAC 

AGCGAATTTA 

AGAGTTAGAG 

CTAGAATTAC 

CATATGACAT 

TTTAATTAAT 



AACTG CTACG 

AAAAGACATC 

TCAACAATAT 

GATAAAAATT 

CCATTCCGAC 

TAT AG AG CAA 

CGAAGATATA 

AACATTTAAA 

AAATTCTCAA 

TGTCGTTGTA 

TTACG CTATC 

AAAAATTCGT 

CTCCGAACAA 

GGCTCTTTAA 

TTGCCTTTAA 

TTGCCTTTAA 

TTTCCTTTAA 

TTGCCTTTAA 

TT^CCTTTXA" 

TTAATGACGT 

AAGTGAAGCA 

ATTCAATACA 

CTGCTTATTT 

GTAATGAAAT 

AATATTGGCA 

ATATTAGAGA 

ATACTGGCAC 

TTTACAAATA 

TATTATATAA 



ATTAAAAAAA 
AAGATTACCG 
CGATCCACGA 
GGGACTCTTG 
TACCCTGAAC 
TTACTGAATT 
AGATCCATTC 
AATCAAAATT 
GTGCGCAAAC 
GAAACAGATC 
ATTCCGAGAT 
CCAAACTTAG 
GTACATACAT 
CTTAAGTTAT 
CTTAAGTTAT 
CTTAAGTTAT 
CTTAAGTTAT 
CTTAAGTTAT 



CTTAAGTTAT 
CATTTCAAAA 
GATGTTAAAA 
TTTTTATAGA 
TGAGCTTGCA 
TCACTTTTCA 
TCAGAGGTTA 
TAGACAATCA 
ACTTGAGAAC 
AGGTGTCATT 
TAAGAGAACT 



TGGAAGCAGA 
AAAAAGGAAT 
TGGAAAAAAT 
AATCTACGAA 
AGCAATATCG 
TTAATATTCA 
CTTTATATGA 
GGGTAGATGT 
ACTTAGATGA 
GATTCGAATC 
TTTATTACCA 
GCCGAAAAAT 
TCGTACAACA 
TAGAG CCTCT 
TAGAG CCTCT 
TAGAGCCTCT 
TAGAGCCTCT 
TAGTGCCTCT 
TAGAGCCTCT 
ATCGATACAA 
TCTATTAATC 
GATATTTTAC 
GGCCTATGGA 
TATACACATA 
AAAGATAATA 
ATTTACTTTA 
AGATTAAATT 
ATAAAAAGGC 
TTTCAAACAA 



TTTAGGTTAT 
ACAGTTTTAT 
GTATGATTTA 
TCAATGGATT 
TTTATATGAA 
TTT AG CT AT A 
GGAATCTTAC 
TGAAAATTTG 
CTATTTTAAT 
AGCAGTTGGA 
ATCATTTCAC 
TTATATCAAT 
ATGCCAAGAT 
TATGCAGTTG 
TATGCAGTTG 
TATGCAGTTG 
TATGCAGTTG 
TATGTAGTTG 



TATGCAGTTG 
AAATAATTTA 
ATATATGCTT 
TTGGGAAATT 
TTGCTTTAAA 
TAGCTTTCAC 
ACGTGAATAT 
CCGACCCTGA 
ATTATAAAGA 
CTCTTGAACT 
TACAGTTGTT 



9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 



10080 
10140 
10200 
10260 
10320 
10380 
10440 
10500 
10560 
10620 
10680 
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TTACTGCAAT TATTTTTCAA ATATATCAAC GTTAATATAA CTTCTATTAA GAAATACTCA 10800 

CATTCTGCCC TGCAATGCAA ATCTCGTCAC ATATAAATAT TTTTAATTAT TTTAAAAAAT 10860 

GATGCACTAA ATTAGCAACG AGCTTAGCAG TTCTATTGTC AGCGTCATAT GTTGGATTCA 10920 

TCTCAGCAAT ACTAACTGAA GACACCTTAT CACTTGGAAT AATACGTTTT GCTAATTCAA 10980 

GAACAGTATG TGGATACAAA CCTAACACTG CCGGCGCACT TACCCCAGGC GCAAACGCAC 11040 

TATCAATGAC ATCCATACAA ATCGTAAACA TAATGACATC ATGTTCATGT ACAAAACGTT 11100 

CAATCATATC TTTAATTGTT GGTGATACGT GACTCAATAA TTCATCTGCA AAGACATAAT 11160

CAATCTTTTT CTCTTTAGCA TAATCAAATA AACTTTGCGT ATTACCACCT TGAGCAATAC 11220

CAAGCACTAA ATAATCTGTG TTTTCATCTT CTTCTAAAAT TTGTCTAAAG CTCGTTCGAG 11280

ATGTAGATTG TTGTTCAGCA CGTGTATCAA AATGCGCATC AATATTTATC ACACCAATAG 11340

ATTGTGTTGG ATAGACTTTA CGTGTTGCTA AATATTGAGC ATACGCAATA TCATGTCCAC 11400

CACCTAATAA AAATGTTTGT CTATGATTAG CAATTGACTT CGCTGCAAGC ATAGCAAATT 11460

CTTTTTGAGT ATCAATTAAT TCCTCATGAT CATGATAAAC ATTTCCGTAA TCGACTAAAG 11520 

TTcACATTGA TTCAAATCCG GCAAACCTGC AAATGCTTGT TTAATCGCAT CTGGTCCTTC 11580

TTTTGCACCA ATGCGCCCCT TGTTTAAAGC AACACCTTTG TCAACAGCAT AGCCTAATAT 11640

ACCGACCCCT GATGGCATAC TACTCTTTTC CAGCTTAGAC AAATCTTCAA ATGTTACTGT 11700

TTGAAAATGT CTAAATTTTT TCGGGTCTGT TTCACTATCT AACCTTCCAG TCCATAAATT 11760 

TGGTTCACCT TGCTTGTACA CAGCATTTCC CCCTCTTATT TATGTGGCTT ATTAAGAATT 11820

AAAGTATAAC GTATAGGAAA TTTTGAATTC AATTCATAGT TAAATCCGTA TCTTAAAAAT 11880

ACTTATCTAC ATTACTTTTA CCCCTATTTT CTATGTAATA ACGAATACTT AGCTGATTTA 11940

TGTTAATAAA ATACGTCAAG ACTATTACAT TTTCATTAAT ATTGACATAG ACAATTTATC 12000

TCTCGGCTTG TAATATGTAT AATTGTTACT AAAAGATATT TTGCTTGTTA CCTAATGGAG 12060

GTTACATATA ATGAAGAACA ATAAAATTTC TGGTTTTCAA TGGGCAATGA CGATTTTCGT 12120 

CTTCTTTGTC ATTACAATGG CGTTATCCAT TATGCTCAGA GATTTCCAGT CTATAATTGG 12180 

TGTCAAACAC TTTATATTTG AAGTTACAGA TCTAGCACCA TTAATTGCTG CAATCATTTG 12240

TATACTCGTT TTCAAATATA AAAAGGTCCA ACTTGCAGGT TTAAAATTCT CAATCAGCCT 12300

GAAAGTAATT GAACGTCTAT TGCTAGCTTT AATTTTACCT TTAATTATTC TAATTATTGG 12360

TATGTACAGC TTTAATACAT TTGCAGATAG CTTTATTTTA TTACAATCAA CAGGCTTATC 12420

AGTACCTATT ACACACATTC TGATTGGACA TATTCTGATG GCGTTCGTAG TAGAATTCGG 12480
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TGTTGTTGGT TTGATGTATT CAGTTTTCTC AGCAAATACA ACTTATGGTA CAGAATTTGC 12600 

TGCTTATAAC TTCCTTTATA CATTCTCATT CTCTATGATT CTTGGTGAAT TAATTAGAGC 12660

GACTAAAGGA CGTACAATTT ATATTGCAAC GACATTCCAT GCTTCAATGA CATTCGGACT 12720 

TATTTTCTTG TTTAGCGAAG AAATCGGCGA TCTATTTTCA ATCAAAGTCA TCGCCATTTC 12780 

AACAGCAATC GTTGCAGTAG GATACATTGG TTTAAGCTTA ATTATCCGAG GTATTGCATA 12840 

TTTAACAACA AGACGAAACC TTGAAGAACT TGAGCCTAAT AATTATTTAG ACCATGTCAA 12900

TGACGATGAA GAAACTAATC ATACTGAGGC TGAAAAATCT TCTTCAAATA TTAAAGATGC 12960

TGAAAAAACA GGTGTAGCTA CTGCATCAAC GGTTGGTGTT GCTAAAAATG ATACTGAAAA 13020

TACAGTGGCT GACGAACCAA GCATTCATGA AGGTACTGAA AAAACAGAAC CTCAACATCA 13080

CATAGGTAAT CAAACTGAAT CTAATCATGA TGAAGATCAt GACATCACTT CGGAGTCAGT 13140

AGAATCAGCm GaATCAGTTA AACAAGCACC ACmAAGTGAC gATTTaACAA ACGATTCAAA 13200

TGAAGATGAA ATAGAGCAAT CATTAnAAGA ACCTGCGACT TATAAAGAAG ACAGACGTnC 13260

ATCAGTTGTA ATTGATGCAG AAAAACATAT CGAAAAAGCT GAAGAnCAAT CTTCAGATAA 13320

A 13321 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8549 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATGTGTTGTA AACTTTTATG TTGAAAAAGC TACTTATCTC AATGAAAACA AGTAGCATTT 60 

AATAAATTAA TTAGTATACA GCTAGTTTTT CTAATTGTTC TTTAACTTGA ATTAAGTTTG 120 

ACCGTATTAG AGAGGCAGAT TGATCCATCG TTTGAATTGC TTGTCCTTCA TTTTCGTTCA 180 

AGCCATTACA AACAACTTCA AACTGTTGTG CCATTTGATC AAGACGCGCA TGAGCTTGTG 240

TGTTTAAAAT AAACATATCG TCATAATGTG ATGGCGAATA GATAATTCGT CGTTGTATAC 300 

AAACGTATAA AAACCTTGTC ATATCAACGG TTTTGGCATT TTTAAACCTC TGTGTTTTCC 360 

ACGCATGTTT GCCCTTATTT AAATAATTTG CCCTTTTTTC GCCCCGAAAA AAAAACACAA 420

AAAAATAACC ACACTCCTAA ATTAATAGGT GGTGTGGTTT TGTTGATTGT AGGGGTATAA 480

AAATAACCGC ATTATTAAAG ATACGGTTAC TCTGTTATCT GTAAATATAA TAGTAGTTTA 540
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AAACAGGACT CCACATAAAA ATCAACTCCT TT AT AT AC CA TAATGATACT ATATTTTCTA 660 

GTTTATTTCA ATTTTTCAGT TTTTAAAAAT GAGTTTCTGT TTTTATTTAT ACGCTTTTCT 720

GTTTTCTTTT TAAATTTTAT CTTTTTGTTA TTCCATTCAT TGTAAAATTC TATTAAATTA 780

ACATAAAATT TTTCATGCCC TATTTTATTT GTTGATGAGA TATCAATGTA AAGACTCAAT 840

ATTGTTTTTA AATAGATTTG ATGCAACGAC TGATAAACCG TATTACTATC TGCTATGTTA 900

TTGGTAAAAT GCATAGAAAA ATATTCTAAT TTATTCATGC AATATATATG GGTTTCATTA 960

TACTTCTTAA TGAGTGTATT TATACCTTGC AATACGTCAT TACTTTTAAT AACAATTTCT 1020

TTTTCACCTG TCGAAAAAGT CCACTGTTTA TCTCCTATAT TTTCTTTAAT TGTTTTCTTG 1080

TTGTCAAATT CTAAAATTAT AGCCCGTAAA CACTCTTCTT TATAATTCTC GTTCTTGAAA 1140

GTACGAAGCA AAATTTTTAT AAATTCGGTA TTGGTGACTT TTTTATAAGT GTGATATTTT 1200 

GCAATCTCTT TATCAGTAAA GACTGTTCTT AGTTCGTGAT TATCAAAACT TAAATTCATC 1260

TTATTCTCTA ATTCATTAAT TTTATCTTGC AAACGAACAT TTTCTAAAAT TTTCTTGTTT 1320

ATCTCCCCTA TATCAAAACT CCTTTTCGAA ATTAATTTTG AAAACTCGTC TGCCATTTCA 1380

ACAGCCTTTT CTTTCCTTTT ATACCTTTTG TTAAATTTAT GAACCACCGT TGCAGCATAA 1440

TACGATATCC CACCAGATAA AATAGATGaT ATTATCGGTA TGTATATATC ACCTTTCATA 1500

TTTCCACCTC TTTTAACACA ATTAAGTATT ATGATACACA ACTTGCGCAA AAAGATGTAG 1560

ACAGAACATA ATGGCGAACA AAAACAACCA CCCAGTAACT AGTATGGGTG GCGTAgACTA 1620

TAACAACTCT ATGTTATCAA GATATATGTA TCGAGTGATG GCAAGGAAGA AGTCTCCTGC 1680

GGGACCAACA GTCAGATATA TGGCCTCTGC CGGGCTATAT AGTTCACTCC TACTATATAA 1740

AAGTAAGTAT AACATAAAAA GCACCCCGTA AACTGTTATA CGGGAATGCT AAAGTCATAT 1800 

ATACTACGGG GAGTAGTATG AAAACTATGC TCTCTATCGT AAGAAAAAAC ACCCAGTGAC 1860 

ATGCTTGGGT GAACAAGGAT AGATGTAAAT AGTTGATGCA TGTGTAcACA TCATAACAAA 1920 

AAACTAGCCC GAAGcTAGCT ATAACATAAA AAAATAGGCA AGTACCGAAG TACCTGCCAG 1980 

TTACGCACAT TTAAATCTTG AGAGTAATGT TAAAAAGTGT ATAGGAATAT TAACATCCAT 2040

CCAAATAGTT ATTTAATAAC TGTAAGATTC CCTATAATTA ATGTAGCaAA ATTTTTATTC 2100 

TAAGTAAATA CTAAATCGTG CTAAACTTAC CAAAACTACT TATTCTATTA CCTGCCTTGT 2160 

CTACCTCTCC TGTCGCTATA TAACGACGTT GTCCACTATT AGCAATATAA GTAATCCATC 2220

TATAGCCATT GATGCAATAT GCGCCGTCAT ATTTAATTGT TGCGTTATTA GGTAATACAC 2280 

CTGTAATTCT TGAATTAGTT GAATAGCCGT CCCTTACGTT ATTACCTTTA ACATTGGCAA 2340 
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CTGGCACTGG TGGATTTTTT TGGTTTTTAG CTGATGTTTT AACATTACCA GCTACCAAAC 24 6 0 

CACCTATAGG CTTACCATGA ATCGCACCGG CTATTAATTT AGAATACAAG TCATAGTTTT 2520

TCTTAATCCA ATCCATATCA TTTTTATTAG TAATAAAACC TAATTCAGAT AAACGATAGT 2580

TTATATTTAT TTCTGCTGAT ACATTAACGT TTAGTAAATC ATTACGAGGT GTTACACCTC 2640

TTATTTGTCC TAAGTTATTT TTAATAACAT CTTGTATACT TTTATCAATA GTATCTGCAT 2700

TGAATTGACT TGAAATAATA ACATGCCCAC CACTTGCACT TTCTCCTGCT GCGTCTAAAT 2760

GAATCTCTAG AACAATGTCA TACCCATGTG ATTTAACCCA ATATAAGCCA TAATCTTTAT 2820

TATTTCCTAC ATTAACACCG TAAGCAGTAT CTTGATACAT ATCTTGTGAT TGACTTGAGC 2880

CACCATATAA TGCAACTTCG TGACCTGCAT GTCTTAAATA CTTAGCGATA TTTGGTGTTA 2940

TATATTTACG GATAAAATCA CGTTCATTTG TTCCGTTTCC GACTGCTCCA GGATCGTTAT 3000

AACCATGACC GGCTACAAGC ATAATTTTTT TAGGTTTAAT TACTGCTTGC TTTTTGGCAG 3060

TTGCTTGCTT AATAACGCTT TTAGCTTTAT CTCCAACACT TACTTTATCT GGGAAATTTA 3120

ATCTAATAAA ATACATTGGG TCATCGTAAT AATGAACATG TCTTGTAACG GTTTCGGGAC 3180

CCCAACCAGG TTGCGCAACG CCATTTGTCC AACCTTTACC ATTCCAATTT TGGCCAAACG 3240

ATGTGAAAGT GTTTAGATTA GCGCTCTCAA CAATTTCAAC ATGTCCaGct CCGCCACCAT 3300

ACTTTGACGG GAAAACGACA ATGTCCAACT TTTGCGGTAA AAAGCTATCA TAGTTTTTAA 3360

TTATTTGCCC GTATTTTTCA ATCCTTGCTT TATTATCAAA TGGAATATTA TAAGCGTATA 3420

AACCTTGTAA CcTTTCGCCT GTTGCTATCA TAAAAAACAT ATTTGCGTAA TCGTAACACT 3480

35 GAAATCCATA-AAAe^AATeA-GGATTG^ 3540" 

CTGCTTGGTT TTTTGTTATC AACATTGGTC AACACCTACC CTAAATCATT TGTGTCGTTC 3600 

ATATTCGTAG GTGTCATTAC TTCTTTAATT GGCGCTTGCC CTGTTGCTTT TCTATACTTG 3660 

TTTTCAGCTT TATATTTCTT TAGCTTTTGA TTTGCCCATT TACCTTCTTG AGATGTTGGA 3720

TTATCTTTAT ATGTAGTATA TAAAGCAACA ACTGTTAAGA TAATCGATGA AACACTTTCT 3780 

TCATCTACTG GTATCGGACT TATACCTTTA TTCGCTAAAA ACTGATTGAC TAATGCTAAG 3840

ATCAATACGA TGTATCTTGT TATTACTTTT GCATCCATTT GTTTGCTCCT TTTATCCAAA 3900 

ATAAAAAGCC AGTGCCGAAG CACTGACTCT TAACTATTAC TTACACTTAC TAAACCAGAA 3960

ACACGACCAA AAGCTATATC CTAAAATTCC CTTAAGCATG GTAATCACCT CCTTTAAATG 4020

CCAAAAATAG TTTTTAACAA GGCTATAACA AATGTACTTA GAATCGTCCC TATTAATCCT 4080

AGAATCCACA TCTTGATGTC TCTAATATTT TTAGCATTTT TCTCTTTATT TTTTTCATCT 4140
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TGCGTTCTCA GACTGTCTTC TATTCTGTCG AATTTTTCAA ACATAGTCTT ATCATTTTCT 4260 

TCTAATCGCG TTAAACGCCA ATCTTGTTCG TGTCGTTTGG TAAATCCAAA CATTACACCA 4320

CCCACTTTAT TCAAATTAAA AAGCCATAAG ATTATAACCT ATGACTCTAG ATTTTCTGGA 4380

TACTTTTCTC CTGTAATAAT TGCATATTCC TCTTTATCTA TAACTTCCAT ATCTACATAC 4440

CACGCTATAT CTTCTTTACT ATATTCTTTC AATTGATACC ATGTTTTAAT ATCTTCGAAT 4500

GTTGGTGAAA TTAATTTAAG CATTTTCAGT CTCTCCTTTA ACCTCTTCTA ATTTTTTATT 4560

AAGTGTCACA AGTTGTTTTG CCATTAGTGC ATTTTGCTTA TTAACTTGCA TCGATAACTT 4620

15 TGTACTTTGA ACAACTTGTT TCTG CAT ACT AGCAACCATT TTTCGTAAGA TGTCATCAGA 4680 

AGCGACTGTG TTTTGTTCTT CACTGTCAAT CTGTTGATGC AAGTCATCTT TTTCTTCTGA 4740

ATAATCTTCG TTAAAAACTA TTTCCCCATT TGAATATTTA AAGGCTTTAG GTCTAAAAAC 4800

TTGAGAGAAA TTTTCTGGTA AATTTTCAAT ATCAATACCT TCTTCAAAGC CACCAATGAT 4860

AGCGTATGAA ATTATCTCAT TACGCTTGTT AACTAATATT TGCATTATTT TCTCACTCCT 4920

ATAATTTTGT TAATTGTCCC TCTATTTGCG TTCGCACCAG AGCCTCTTTG ACTTCCTAAG 4980

TCGAAATAGA CATCGTTTGA TATAGTTAAA GATGTACGAC TAGATTTAGT TAATCCAAAC 5040

TCATAAACAC CTCCACCATT TCCATCACCA TCTGGAAGAT TTGAGGGATT CAATGAAATC 5100

TTTCCTCCTC CAAAAGGACT GCCAAACTCT GTAAAGTCAC CACCTGGAAA AGTCCCATAA 5160

AAAATTAATA AAATAAATTG GTCTAAACTC TCATTTAAGT ACAATGTAGA GCCCACACCA 5220

TTTGCTGTTC CATCAAAAAT AACCGAATAC CTTTTATTAA ACTTGTCATC TGCGTATAAT 5280

TTAGCGTTAC TTTCGGCCAT ATTAGCTTTT GATTGGGCAC TTTGAACAGT TTCAAAAGGT 5340

GTATTGTAAT CATTAATAGC TAATTCTGAC CACTCAGACC ATGAACCCGC TTCTTTTCTT 5400

TTAACAAATA CTTTATTTGT ACCGTTCGGT CGATAAGTCA TACGCTTGTA ATCTGAAGTT 5460

ACTACTAAAT ATTCGACAGT ACCGTTAGTA CTAACACCTC TTGGATAATT TATAGCTTGC 5520

GAAACATAAA TAAATTGGGT TGAATCACCT ATTCTTTGTT CTGGATTATT AAAATCAAAT 5580 

CCAGTAATCT GCATTATCTT ACCATCATCT TTAGTAATCT TAGCTTTTTG CCAATTTGAA 5640

GTAGAACCAC TTGTGACTAA ACCACCACTA TTCACTGACT GCTTGAAGGC TTCATGTTTC 5700

TCATCCATAT ATCGCTTTTG CTCATCGAAT GTTCTTGAAT ATGCTTGCGC TTTATTTTCC 5760

AAATCAGATA TATGGCTATT AGCAAGTTGC TTTAATTCAT CTATACTTGA AGATTTTGCT 5820

ATTTGAATAT CTGATAGACC TTTTTCTTTA GCTTTTTCAA TCAGACTCGC ATAATCTTCA 5880

CCATTTTTTA TAGCCTCGTC CATTGCTTTC GCACGATCCA TAATAGTTTT TTCTAATTCC 5940
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TCAACGTTAA ATGTGATAGT TCTCTCGACA ACTACCACGT CTGAATTACC TAATTCTGCA 6060 

ACCGAAACTT GAGCTTGATA ACTTCCATCT CGTTTAATTA CATCATTAGG TAATTGAAAT 6120 

TTTAAAATAC CTTTAAATGG ATCTAATATT TCTAGTGGAG CAACTACCAT GACTCCTTTA 6180 

CCTCGAATCG CTATTCGTGC kTTGATATTT tCTTCACTCA ATAATAACGG TTGATTATTT 6240

TTAGTGATAT TAAAAAGAAG AACAGAAGAA TCACTCTCTC CTGTTCTAAA AGTTATATCT 6300

AGATTTGAAA TATTTCCATA ATGCGCTGTG TTTTCTAAAT TTATAGCTAC AGATTTCTCT 6360

AAATTACTCA TTAACTTATA ATTCTCCCTT CGTGTAAAGT CCATGGCCCT GAACTTGTTT 6420

TACTATCATA ATTTTTCAAT AGTATCTCAG CAGATGCTGT AACACTATTA CGAACTAGCC 6480

TATGAACAAA GCCACCTGTG TTTGAAGCTT CTACATATAA GTTCCAACCA GCTACCCCTT 6540

TACGTTCAGT TGGAAAATCT GTAAAACGTT TTGTATCATC CGTAGTTAAA TAAAACGACA 6600

TGCCTACTAT GTTAATATCT GACATTTTTG TGATGAATGA AGGTACTCTC TCCCATTTAC 6660

CACTATTTTT AGGCACATAA TTCCAGTCCG AAATGTCTCC AGTTCTTCCA GAAAGCACCC 6720 

TTTCAAAAGT CATCATATTC CTTGCATAAC TATTACGCGT CAATATCTGA ATTACATCAC 6780 

CGCCAGTTTG TGGTGGCTTA ACTTCCAAGA ACCAACCTGC ATCACGCCAT TCTCTTGGTA 6840 

ATGGGAAATC ATCGATTTGA ACTGTATGAT CAGTGTATAA ATAGTAAAGA CCTGGCTCTG 6900

TTAACATCCC AAGATTCTTA AGTTTATCAG GCCTCATTGG TAAAGGTTTA ACTCTACCAC 6960

CTGTGTCACT CaTGATAAAA GGAACGCCTC TTGAGTGAAG TATTTCTAAA ATACCTCTTT 7020

GCCCAATCAT GAAAATACGA TGTGTTCTAT TTCCaTCACC ACCGACAGTA ACACCTAGCA 7080 

^ TC3UW3CITT-TTTACCA^ 7-140- 

AATTCGCCAG GAAATGAATC tAgTGTTCCA CCATAGTCAG CATTAACCTG ATACGCTTCT 7200

TCTCCTGTTT CTAAATCGAA AGCCGTTAAA TAGTTTCTAT TATTTGGATT ACTGTCTCCT 7260

GTATACCAAT ACAAGTATTT TTCATCAAAA GTCACACCCT GCATTGGTTG GGTTTCGTTT 7320 

GTTAGTCTCA TAGGGATACT GATTTTATGC AAAACTTTAT CAATATTTTT ATCAACATCG 7380 

TCTAAACTTC TTATCTCTAT ATAAnTCATT GAGTTTTCAA GTTCCCACTG ACTTCTAGGT 7440

CTCTCaATTC TGTATAGAAT TTTATTTTCT TTTTCATTTA TGACAGGGGT GATGTAGGGT 7500 

TTTTCTGGGT GTCCTGTAAA TACATCTTGC ATACCATACT TGCCATAGCT AATTTCCACA 7560 

TTAGGCGTAT ACTTGAAACG AACTAATGTA TTCTCATTAT TACCATTTAA GATAAAACTA 7620

TAAATCCATA ACTCATcATC AATATATCTA TAACCGTTAT GTGTACCATG ACCCCCACCT 7680 

ACAATCAATG AGCTGTCTAT AAATTGACCA TTAGGTCTTA GACGACTTAG CATATAGCCA 7740 
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ATT ACTG CAT TTGTAAgAGG TGCAAGTTCT GTCACAAATA AAAATTCTTG CTTATCAGGT 7860 

TCAAAACGAT ACTCGATATC AAGAATTTCT TGTTTGGTCT TATTTAATTC TCTTATAGTT 7920 

TCCTCTTTAT TAATTTGAGT TTTGGTTTCC CAATCGTCTA AATGTTCTTT TAATGTGTCA 7980

AAGGTTTCGC CGTTTACATT AACTCGAGCT TGAACAATCT CATTAGCACT GTTATTACGT 8040

GGTGCCACAA CAAGTGCGTT AATTTGACTT TGTAAAGATT TGTTTACTGC TGCTTGCGAT 8100

CTACCATTAT AATAAATTTG CTCAGCGAAG TGTTGAATTG TTTTAGCTyT CTGATGCAAC 8160

TTAAACTCTG TTGTCAAGCC AAGCGCAAAT TGCTCTATTC TTTGTAAGTT TTGTATTTCC 8220

TTAGCTCTAT AATCTCGACC TGCTAAAGCT CCCAAATCCT TTATTAAATA CAAATTTTCC 8280

ATAATGCACC TTCCTTTCTA ATAAAATAGC ACTGTACCAA GTTTCCCACT ATCGTCAACT 8340

GTTATTTTCC ACAATTTACC GTTTGGGGAT TTCTGTACAA TGCTATTTTG AATAATTgcC 8400

TGctTCGCCT ATTTTTAAAT TATCTAATTT ATTTkTATCA TTTACCGAAA TGATACCGTC 8460

TTGAGGCAAT CCATCAATAn CACTACTGCC TGCATAAGGT ATCCCATTTA TAGCTTTCCA 8520

ATGTGTAGCT GGAAAGTACT GTTTATCGT 8549
(2) INFORMATION FOR SEQ ID NO: 6: 






(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3601 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

AGGCGTGTAG TGACTTACGG nTAGGAAACT ATGTATCCGA ATGATTTATT GAGACCAAAA 60

AGGCATTAAA GTCCATTGAA ATATCnGGTA GCGmGTTGGT ACgTGGACGT GGGGGCCCTA 120 

GATGTATGAG TCAACCATTA TTCAGAGAGG ACATTTAACG TAATAAATTA TAGAmACGAG 180 

GGTGAAAATA ATGACAGAAA TTCAAAAACC GTATGATTTA AAAGGCAGAT CATTATTAAA 240

AGAAAGTGAT TTTACCAAAG CAGAATTCGA AGGACTTATT GATTTTGCAA TTACATTAAA 300

AGAGTATAAG AAAAACGGTA TTAAGCATCA CTACTTATCT GGAAAAAATA TTGCACTACT 360 

ATTCGAAAAG AATTCGACGA GAACGCGTGC TGCGTTTACA GTTGCGTCTA TTGATTTAGG 420 

TGCGCATCCA GAATTTTTAG GAAAAAATGA TATTCAATTA GGCAAAAAAG AATCTGTAGA 480

GGATACTGCG AAAGTATTAG GTAGAATGTT CGATGGTATT GAATTCCGTG GTTTTTCACA 540

ACAAGCTGTT GAAGATTTAG CGAAGTTCTC TGGTGTACCG GTGTGGAATG GATTAACAGA 600 
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TCTAGAAGGA ATAAACTTAA CTTACGTTGG AGATGGACGT AATAATATTG CGCATTCATT 720 

AATGGTAGCA GGTGCTATGT TAGGTGTTAA TGTAAGAATT TGTACACCTA AATCATTAAA 780

TCCAAAAGAG GCATATGTTG ATATTGCAAA rGAAAAaGCG AGTCAaTATG GTGGTyCAGT 840

CATGATTACG GATAATATTG CAGArcCAGT TGAAAaTwCm GATGCTATAT ATmCAGATGT 900

TTGGGTATCG ATGGGTGAAG AAAGTGAATT TGAACAcGTA TTAATTTATT AAAAGACTAT 960

CAAGTGAATC AACAGATGTT TGATTTAACA GGTAAAGATT CAACGATATT CTTACATTGT 1020

TTACCAGCAT TCCATGATAC AAATACACTT TATGGACAAG AAATTTATGA AAAATATGGA 1080

TTAGCTGAAA TGGAAGTTAC AGACCAAATC TTTAGAAGTG AACATTCAAA AGTGTTTGAT 1140

CAAGCTGAAA ATAGAATGCA TACAATTAAG GCAGTAATGG CAGCAACATT GGGGAGTTAA 1200

TCACTAAATG GAACGATATG AATATGATGT GTCTGATGAT ATAAGTGTCA TGTACAGACA 1260

CCTCATATTG GTATTAAAGG AGAAATGAAT ATGAACGAAT CAGGAGATAA CAAACTCAGT 1320

AAATCTTCTT TAATTGGACT AGTTATAGGA TCCATGATTG GTGGCGGTGC GTTCAATATA 1380

ATGTCTGATA TGGGCGGTAA AGCCGGTGGA TTAGCCATTA TTATTGGTTG GATTATTACA 1440

GCTATAGGAA TGATTTCATT AGCGTTCGTA TTTCAAAATT TAACCAATGA ACGGCCGGAG 1500 

CTAGACGGTG GTATTTATAG TTATGmTCAA GCAGGATTTG GCGATTTTGT AGGATTTATC 1560 

AGTGmTTGGG GATATTGGTT CTCAGCGTTT TTAGGCAATG TTGCCTATGC AACACTATTG 1620

ATGTCAGCAG TAGGTAACTT TTTCCCGATT TTTAAAGGAG GCAACACATT ACCAAGTGTT 1680

ATTGTCGCCT CGTTACTACT CTGGGGTGTC CATTTCTTGA TTTTAAAAGG CGTTGAAACA 1740

^ GGAGGATTTA--TGAATAGTAT— TGTTAeTGTT-GeAAAGTTAA-TAeeGATTTT— AeTTGTAATC 1800- 

ATATGCATGA TAATTGCATT CAATTTTGAC ACTTTTAAAA CAGGCTTTTT CAGTATGACG 1860 

TCAGAGGGTG TATTGCCATT TAGTTGGGCG AGCACAATGA GCCaaGTtAA AAGTACGrTG 1920 

CTAGTGACAG TTTGGGTGTT TATCGGTATC GAAGGTGCAG TAATTTTTTC TAGTAGAGCT 1980

nAAAATGAGA AAGATGTAGG TAGTGCCACG GTTATAGGAC TTATATCAGT TTTAATTATC 2040

TATyTCTTAT TAACTGTATT AGCTCAAGGC GTGATTTTGC AAAATCATAT TTCGCAATTA 2100

GATTCGCCAA GTATGGCACA GGTGCTTGCA ACTATTGTAG GTGGTTGGGG ATCTACACTT 2160

GTAAATATTG GTTTAATTAT TTCGGTACTA GGTGCATGGT TAGGATGGAC ACTGCTTGCT 2220

GGTGAATTAC CTTTCATTGT TGGAAAAGAT GGATTATTTC CAAAATGGTT TGCTAAAGAA 2280

AATAAAAATG GAGCACCTGT AAATGCACTG CTTATTACCA ATATATTAGT ACAATTATTT 2340

TTAATAAGTA TGCTATTTAC ACAGAGTGCG TATCAATTTG CATTTTCACT AGCATCAAGT 2400
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CGACAG CAAG CAACTACTAA ACAATGGACG ATTGGTATCA TAGC CTCAAT TTATGCTATA 252 0 

TGGCTTATAT ATGCAGCAGG TATCAATTAC TTATTATTGA CGATGTTACT TTATATTCCA 2580

GCTCTTCTTG TTTATACaAT CGkTCmAAAG rATwATCAGa CACGTTTGAT TAAATCAGrC 2640

TATATTCtTT TTATGATTAT tATCGTACTT GCAGTTATCG GGTTAATTAA GTTATTGATG 2700

GGAACGATAA ATGTTTTTTA AAAGGAGCGA CAAAAATATG AAAGAGAAAA TTGTCATTGC 2760

ATTAGGCGGT AATGCGATAC AGACAACAGA AGCAACAGCT GAAGCACAAC AAACAGCTAT 2820

TAGATGTGCG ATGCAAAACC TTAAACCTTT ATTTGATTCA CCAGCGCGTA TTGTCATTTC 2880

ACATGGTAAT GGTCCACAAA TTGGAAGTTT ATTAATCCAA CAAGCTAAAT CGAACAGTGA 2940

CACAACGCCG GCAATGCCAT TGGATACTTG TGGTGCAATG TCACAGGGTA TGATAGGCTA 3000

TTGGTTGGAA ACTGAAATCA ATCGCATTTT AACTGAAATG AATAGTGATA GAACTGTAGG 3060

CACAATCGTT ACACGTGTGG AAGTAGATAA AGATGATCCA CGATTTGATa ACCCAACTAA 3120 

AcCAaTTGGT CCTTTTTATA CGAAAGAAGA AGTTGAAGAA TTACAAAAAG AACAGCCAGA 3180

CTCAGTCTTT aAAGAAGATG CAGGACGTGG TTATAGAAAA GTAGTTGCGT CACCACTACC 3240

TCaATCTATA CTAGAACACC AGTTAATTCG AACTTTAGCA GACGGTAAAA ATATTGTCAT 3300 

TGCATGCGGT GGTGGCGGTA TTCCAGTTAT AAAAAAAGAA AATACCTATG AAGGTGTTGA 3360

AGCGGTTATA GATAAAGATT TTGCTAGTGA GAAATTAGCA ACGCTGATTG AAGCAGATAC 3420

CTTAATGATT CTTACGAATG TAGAAAATGT ATTTATTAAC TTTAATGAAC CTAATCAACA 3480

ACAAATCGAT GATATTGATG TAGCAACACT GAAAAAAtAC GCGGCACAAG GTAAGTTTGT 3540

GGAAGGATCG tGTTGCCAAA AATAGAAGCT GCGtACgtTT GTTGAaAGtG GGGaAACCAA 3600

A 3601 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7: 

CGACACTATT AAATGAATTA GAGCACAATC TAACAAATCA AATTCATTTT TCAAAAGATG 60 

AACGACTCAC ACATATCGCT TTAAAGTTAT TCGAAACAAC CGATCCTGTT TCAACAAAGC 120 

AACTTGCGCA AGATGTTAAT GTTTCGCGTC GGACAATTGC AGATGATATT AAAATGATTC 180
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TTATTGGTGA GGAAGATCAT TATCGTAAAG CGTATGCACA CTTTATACAT CAATATATGA 300 

AACAAGCTGC ACCTTTTATA GAGGCGGATA TCTTTAATTC AGAATCAATC GCATTGGTTC 360

GCCGTGCCAT TATTAAGACA TTAAATAGTG AAAATTATCA TTTAGTTCAG TCGGCTATCG 420 

ATGGCTTAAT CTATCATATA CTCATTGCCA TTCAGCGTTT AAATGAAAAT TTTTCGTTCG 480

ATATACCTAT CAATGAAATT GATAAATGGC GACATACTAA TCAGTATGCn ATTGCTTCAA 540

AAATGATAGA AAACTTAGAA CGCAGTGTAA TGT 573 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1221 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TTGATATTTA TAACGTTATA TTTTAATAGT TCACCTGGAT TATTAAATAA ATAGTCCGCC 60 

AAATTTTCTT TTTCTTTATC AATCTGaTkG TAATTAACaC TTTCGaCTTC TGTAGGAATT 120 

CTAATGTCAA CAGAAGCATT GATATAAGCT TGATGTTGCA TGCAATCACA CTCCTAATCC 180 

TTCATmTmAA ACGGAGAAGT AAACCCGTCA CTATTCAAAT TCAATCCTTT TGCCCAATCA 240

ACAGGCTTAT TCATGATAGT TTCGATTTCC TTAAGTCCAT TTGAACCTCT AGGTATTTCT 300

ACAATTACTT CATCATGGAC ATGGCCAACT ATTTTAAAAC CTAATGCTTC AAGCCTTGCT 360

35 ATAGAAATCG-CAAGTAAATC-GCTTGGAGTT-GGTTC 420 

CCATACGTTT TTAACTTTGA CCATTTACGG TTAAGATCTA ACCCCATAAA TTCAACAACT 480

TGACTACCCC AACTATTTTC ACCAACTAAA GCTTTTGGAT AAGCTAAAGC TCTTCCACTA 540 

GGCAGTTCAA TCATTAGAAA ACCTTTTTTC ATATAAAATC TAAGTCCATG TGTATGATGC 600 

GTCTTTCGGG ATTTTACAGT ATTAATTGCA GCCTCTTGGC AAGCCTTCCA AAAATTAACT 660 

ATGTTAGGAT TTGCGTTACG CCAACTATCA ACTAAACCTT GTAACTCGTT TTCTTCAATG 720

CCCATTTCCA ATGCACCCAT TGCTTTTAAA GCTCCAGCGC CACCTTGATA GCCTAAAGCT 780

AATTCGGACA CTTTTCCTTT TTGTCTGAGA GGGTCGCCTT TAGTTATGCT TTCTACCGGT 840

ACATTAAACA TTTGAGAAGC CGATGCTTCA TATATCTTTC CGTGTGTGTT GAATACATCT 900

AAACGCCATT GTTCTTTTGC ATACCATGCT ATGACTCTTG CCTCTATTGC AGAAAAATCA 960 

CTTACTGCTA GTTCATTACC TTCTTCAGCA GTAAATGTCG TCCTAACTAA TTGACTTAAT 1020 
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AGATCTCTTG CTATTTCTAA TTCAGTATCT GAAATATAAT G CT T TGTTAA ATTCTGAAGT 1140 

TGTACACCTC TACCTGCCCA TCTTCCAGTA CCGGCACCGT AAAATTGAAA CAGACCTCTT 1200 

ACCCGTTCAT CACTGCACAT C 1221
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
TTTTGTTTGG TATGAGGTAG CAATGACGAC GTGTCATTGG TGGAGATTGT AAAAATACAT 60 

AATAAAAAGA AGCGGCAATG TATACCGCTC CTTTTTTATA CTACATACCG ATTTTCAACC 120 

ATCTCTTTCT ACTTAGTAAT AAGACAATAG TATTAACTAT AAATAGAAGA ACGAAGAATG 180 

ATACTATATT TATAATTTCA GTAGGACACA TAAATGTTGA CTCGTTATTC AATATTTTTT 240

CTACGGCACG ATACATCGTA TTGCTCGCCT CAAATGGAGC AACGATACCA AATATATTTT 300

TATTAATGGC AACTAAGATG ACTGAACCAA TCCAATATAC AATGCTGATA CCTAAGCTGA 360

TTAAAATGTT AGGTGAAACC ATACTAATCG TTCCAACAAC TAAGATATAT TGTAAGATAA 420

CGAGTGAAAA TAAGATTATT AATAGTAAGT AATGTGAGAA ATCCGAATAT ATAATTGAAA 480

TAATAGTGAT ACTTAGAATT ATGAACACTA AACATTCAAA AAATAACACT GCTACCTTTT 540

TATAGAAGAA GGTAAAGATA TTATCGCCAA TCAATTTATA AAACAGGATA TTTTTATTCG 600

AATACTCTTT ATTAATAAAA TATGCAATAA CAAATGAAAA TAGTAAGAAC CCTAATTGCG 660 

TTGCAACAGT ATATGAACTG AAGAAAAACT GGCTATAGCT TAAACTTTTA ACTTTGTCTA 720 

TACCTATTGG TAAAAAATAC CCAAGTAAGA AAAGGAATGT GAATAGCACA ACAAGCGTGT 780

AAATAATTTT ATTGGAAATA CTTTTTTTAA ATTCTAATTT CAAAGTGGAC ACCTCAATTA 840 

TAAATTAATG TAATCATTTA TGACTTCTTC TTTTGATTGG TACTCTTCTA TTTGAAGGTC 900 

TTTAAAAATA AAGTATTTAC CCGGCAAAGC ACTTAAATCG GATAAATTaT GTGTAATATT 960 

GATAATAGTT TTAGTTTGAT GGCTTTGAAT AAAATCATTT AAAAATTCAT AAATTTCATT 1020

AACTGTTTTC TTGTCTAAAG CGTTTGTAAC TTCATCTAAT ATGATTAAAT CATGATCTTC 1080

CAATAAGAAA 1090 
(2) INFORMATION FOR SEQ ID NO: 10: 
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(A) LENGTH: 904 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



TTAGGACTAT TTTATCATAT TCATTTAAAT TACGGCTAAA AATTTTAAAA ACGGGGATTA 60

ATATATGGAA TTAAGCTATG AAAGTTAATT GATACTTGCA TTTTACGCTG ATTTATATAA 120

GAATAACTAT TGTATAGTTT TAAAAACGAA CGTACGTTTG CAGGAGGCGA AATCATTGGC 180

AATGAATAAA CAAAATAATT ATTCAGATGA TTCAATACAG GTTTTAGAGG GGTTAGAAGC 240

AGTTCGTAAA AGACCTGGTA TGTATATTGG ATCAACTGAT AAACGGGGAT TACATCATCT 300

AGTATATGAA ATTGTCGATA ACTCCGTCGA TGAAGTATTG AATGGTTACG GTAACGAAAT 360

AGATGTAACA ATTAATAAAG ATGGTAGTAT TTCTATAGAA GATAATGGAC GTGGTATGCC 420

AACAGGTATA CATAAATCAG GTAAACCGAC AGTCGAAGTT ATCTTTACTG TTTTACATGC 480

AGGAGGTAAA TTTGGACAAG GCGGCTATAA AACTTCAGGT GGTCTTCACG GTGTTGGTGC 540

TTCAGTTGTA AATGCATTGA GTGAATGGCT TGAAGTTGAA ATCCATCGAG ATGGTAATAT 600

ATATCATCAA AGTTTTAAAA ACGGTGGTTC GCCATCTTCT GGTTTAGTGA AAAAAGGTAA 660

AACTAAGAAA ACAGGTACCA AAGTAACATT TAAACCTGAT GACACAATTT TTAAAGCATC 720

TACATCATTT AATTTTGATG TTTTAAGTGA ACGACTACAA GAGTCTGCGT TCTTATTGAA 780

AAATTTAAAA ATAACGCTTA ATGATTTACG CnwGGgTAAA GAGCGTCAAG AGCATTACCA 840

TTATGAAGAA GGGAtCaAAG rGTTgTTAGT atGTCCAaTG ArGGAAAAGA AGTTTTGCCT 900

GACG 904


(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11271 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:






GATTTCTAAA TCAAGATCTG TTTTACGATA ACCATTCAAA CCTTGACGTT CATCTTCTTC 60

AGGTTGATTT TGTTGCTGTG TGTCTTTGTT GTCAGAAGTC GCTACTGTTT TTTTATTATC 120

TGTTTCTTTA GTCATAACAA ACGCCTCCGT TATAAAACGC TATATTTAAT GATATGTGAT 180
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TTAATAAGAC GATTCAGCAA GTTTTAAAGT ATTATTTGAC TATGTTGGAT TAGGCATCTA 300 

GTCCTATAAT ATCACTGACA TTGTCAAAAT GATGATCTTT TAAGTAACGT GCGATGCCTT 360

TGTTCATTTT CTTAGTTAAA CCTGGGCCTT CAATAACAAG TGATGAATAA ATTTGAATAA 420

GTGACGCACC GTGACGCATC ATTTTGATTG CATCTTCAGT ACTGAATACG CCGCCTGTAC 480

CTATAATTAA AAATTCACCA TTTGTTTGCT GATAAgCATa CTTAATCAAT TTTAAATTAC 540

GTTCAAATAA TGGACGACCA CTCAAACCGC CTTCTTCGAC TTTATTAGCA GAAGTTAAAC 600

CATCTCGTTG TCGCGTTGTG TTTGCTAAGA TGATACCGTC AAATGTCTCA GTAATCGCTG 660

GTAATAGTGC TTTTAAGCCA TCGAAATCCA TATCAGACGT TAGTTTTAAA TAAATTGGCA 720

CTGTTACATC ATGTTGTTTT TTAAATGCTG TTAAAGCTTG GCATAACATT GAAAATTCAT 780

CTTTATCATG GAAGTTTTGA AGATTTTCAG TATTTGGAGA ACTGATGTTG ACTGTGAAAA 840

ATGAAACGTC GTGTTTAAAC GTATCAATAA CCTTTATATA ATCTTGATAA CGCGCTTCAT 900

AAGGTGTCAT TTTATTCACA CCAACATTGA TACCAACAGG TACTTGATAA GCATTTTTAC 960

GCAAATGACT TAGTGCTTTG TTCATACCAA TATTATTGAA GCCCATTCGA TTTATCAAGG 1020 

CGTCATCTTC TAATAATCTA AACATGCGTG GTTGAGGGTT ACCCGGTTGA GGTTTAGGTG 1080 

TGATACCACC TAATTCTAAA GCACCGAATC CAAGGTGTTC CAATGCTTTT GGTACTTCGC 1140

AAGATTTGTC GAAACCAGCT GCTAAgCCAA TTGGATTGTC GTACGTATTA CCTTGTATCG 1200

TTTGTGATAA CGTTGGATTC TTATAAGTAA ATAGTTTATC GACGACTGGG AATAAAACCG 1260 

GaAACTTTTG TaACGTTTTT AATGCATCGA TAGTTAGTCC GTGTGCTTTT TCGGGTTCGA 1320

TTTTGAATAA GAAAGGTTTA ATTAATTTGT ACATGAGTAT GCTCCTATTT CATTATATTT 1380

GAGGCTTACT ATCCTCAACT TAATATATGT GAAATATATT CTTTTAATAG ACTAGCATTT 1440

CCATACATAA TTTCCTAGTT AAAACTAAAA AGTTTTGAAA ATTGACGCAA gTTTGAATAA 1500 

CGTTTTTAAG ATTAAATCAT CCTAATTAGG CAATATTATA GTATAAAGTA AGTAGATTGG 1560 

AAGGTGTTTG TATGAATGAA CAATGGTTAG AGCATTTACC TTTAAAAGAT ATTAAAGAGA 1620

TTTCACCAGT GAGTGGTGGT GATGTAAACG AAGCATATCG AGTCGAAACA GATACGGATA 1680

CATTTTTCTT ACTTGTCCAA CGTGGACGTA AAGAATCATT TTATGCTGCA GAAATTGCAG 1740

GTTTAAATGA ATTTGAACGT GCAGGTATCA CGGCACCTAG AGTAATTGCA AGTGGCGAGG 1800

TTAACGGTGA TGCGTATTTA GTGATGACGT ATTTAGAAGA AGGGGCTTCA GGGAGTCAAC 1860

GCCAATTAGG GCAACTCGTA GCTCAATTAC ACAGTCAGCA ACAAGAAGAA GGCAAATTTG 1920

GCTTCTCATT ACCTTATGAA GGTGGCGATA TTTCTTTTGA TAATCATTGG CAAGACGATT 1980 
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GGCTATGGGA TGCCAACGAT ATCAAAGTAT ATGACAAAGT GCGACGTCAA ATTGTGGCGG 210 0 

AATTAGAAAA GCATCAAAGT AAACCGTCTT TATTACATGG TGACCTATGG GGTGGTAATT 2160 

ATATGTTCTT ACAAGATGGT CGTCCGGCGT TATTTGATCC AGCGCCATTA TATGGTGACA 2220 

GAGAATTCGA TATCGGTATT ACAACGGTAT TTGGTGGTTT TACGAGCGAA TTTTATGATG 2280

CGTATAATAA ACATTATCCA CTCGCAAAAG GTGCATCCTA TAGACTTGAA TTTTATCGTT 2340

TATATTTATT GATGGTCCAT TTATTGAAAT TTGGTGAGAT GTACCGTGAT AGTGTTGCGC 2400

ATTCTATGGA TAAGATTTTA CAAGATACAA CAAGTTAGTT AAGACGTTAG ATTGAGATAA 2460

ATAGATAATA TGCACAGATA TTTTTACAAT GAGAAGCGAT ACAGCTGCCT CAATAAAAAT 2520

ATTTGTGCGT TTTTATTGTT GGAAAATAAA ATTTTAATCG CTATTGTTAA TTTCTGTAAT 2580

GTAAAACAAG GTTGAGTTAC AATAAAAGTG ATTTTATAAC TTTTTGTTCA ATAAAATTCT 2640

AGGAATGATA CATATTTATT GATACAATAA TTTTGAATAT AATCATAAAA CAATATTTAA 2700

GTATAATTGA ATGTTTGAAT ATCATATATT GATACAGTTT CTAATAATTT TAAAATAATT 2760 

TAAATGGAGA GAGGTGTAAA TGATGAGTAC AGTTCAAAGT GATATTTTTA AGACCAATAG 2820

TGCATCATCA TCTATTAAAA GCGCTGTTGA AACATGTAAT AATGTGTCGA AACCGGATAA 2880

AGATGAAAGT ACAACAGTAA GTGGAAATAA TAATGCTCAT AGTGTGATAG ATGATTTGAT 2940

GAGTAAGAAT CAATCTGTTG CTGAAGCAAT ACGAACTGCG AGCGATAATA TACAAAAAGT 3000

TGGTGAGGCT TTTGACCAAA CTGACGTAAT GATTGGTAAT GAAATTGGTA AAAATTAAAA 3060

CGTGGTGAAA TGATGTCGAA TAAACTGGAT GAAATCAATA AAATAATCAC AGCGAAACAT 3120 

35 GAGCARATGG - ATGACTTATA - TGATGAAAAG - CGAGAGGTTA~ 3180- 

GATGCGCTTA ATCATTCGAT AGATCAATTA TATCAACATT TAGGTGAGCG TTATTATAGT 3240

AGCAATATGG CTAGTCGTAT GGAACAGTTC CGCGATGAAT TTCATTTTGC GAAACGACGT 3300

TCAACGGAAG CGTTATACGA GCAGCAACAG CAAATTCAAC ATGGCATTCG TAAAGTGGAA 3360

GAAGAGATGA TTGACTTGGA AATGCGAAGG AATGTTGAAA TTGAGACGGT GACAAAGGAG 3420

GAAAATAAAT GGAAACAATA GGAAGCATTA TTTATTTAAA AGAAGGTTCG CAAAAGTTAA 3480

TGATTATTAA TAGAGGmCCA aTTGTAGAAA TTGAAAATCA AAAGTATATG TTTGACTATT 3540

CTGCATGTAA ATATCCGATT GGTGTTGTAG AAGATGAAAT TTATTATTTT AACGAGGAAA 3600

ATATAGATTC AGTTATTTTT AAAGGTTATT CTGATCAAGA TGAGGTTAGA TTTCAAGAGT 3660

TGTTTGAAAA TATGAAACAA AATTTGGATA GTGAAATACA ACGTGGAGAA GTTACACAAC 3720 

AATAAAGAAA TACTTTTTCT TTATTGGGGT GGGACGACGA AATAAATTTT GTAAAAATAT 3780 
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ATGTCATTCA TAATCATTTG AACTAAACGT AGCAGCCTTA AATTTTAAAA AAAGACACAT 3 900 

ACCAACTTCC GAAATGTAGA TGAATTCTCT ACAATAACGG AAGTTTTTCT TTTAATATTG 3 960 

AAATTTCTCA AGGATAGGTC TATACTTTAT AAATCGTAAT TATTACGATT TATAATCAAA 4 020 

AACAATAACT TGAAATAGAT CATTGAGGGA GTGTTAATAT G CAACAT CAT AAAGTGG CTA 4 080 

TTATcGGTGC CGGTGCTGCA GGTATAGGTA TGGCCATTAC CTTAAAAGAT TTCGGTATAA 414 0 

CAGATGTCAT TATTTTAGAA AAAGGAACAG TAGGACATTC ATTTAAACAT TGGCCGAAAT 4 2 00 

CGACCCGTAC GATCACGCCA TCATTTACGT CTAATGGATT TGGCATGCCT GATATGAATG 4 260 

,5 CAATTTCCAT GGATACTTCA CCAGCATTTA CATTTAATGA AGAACATATT TCCGGAGAAA 4 32 0 

CATATGCTGA ATATTTACAA GTGGTTGCCA ACCATTACGA GCTGAATATC TTTGAAAATA 43 80 

CAGTTGTCAC AAATATATCT GTAGATGATG CATATTATAC GATTGCAACG ACAACAGAGA 4 44 0 

TATATCACGC GG ATT AT AT C TTTGTCGCAA CAGGTGATTA TAATTTCCCT AAAAAgCCAT 4 500 

TTAAATATGG TATTCATTAT AGTGAAATTG AAGACTTTGA TAACTTTAAT AAGGGGCaAT 4 560 

ATGTGGTTAT CGGAGGTAAT GAAAGTGGCT TTGATGCTG C ATATCAACTT GCAAAAAATG 4 620 

GCTCTGACAT CGCACTTTAT ACTAGCACAA C CGG TTT AAA TGATCCGGAT GCTGATCCTA 4 6 80 

GTGTTAGATT GTCACCTTAT ACACGTCAGC GACTAGGTAA TGTCATTAAG CAAGGTG CTC 4 74 0 

GCATCGAAAT GAATGTACAT TATACAGTTA AAGATATTGA TTTTAACAAT GGACAGTATC 4 8 00 

ATATCAGTTT TGATAG CGG A CAAAGTGTGC TTACACCTCA TGAACCAATA CTAGCAACTG 4 860 

GCTTTGATGC AACAAAAAAT CCAATCGTTC AACAATTATT TGTGACAACA AATCAAGATA 4 920 

35 TTAAATTAAC AACACATGAT GAATCGACAC GTTATCCGAA TATTTTTATG ATTGGTGCAA 4980 

CAGTTGAAAA TGATAATGCC AAATTATGCT ATATCTATAA ATTTAGAGCG CGATTTGCAG 5040 

TAC&GCACA TCTTTTAACA CAGCGGGAAG GcTTACCAGC TAAACAAGAT GTCATTGAAA 5100 

ATTATCAAAA AAATCAAATG TATTTAGATG ATT ATT CATG TTGTGAAGTG TCATGCACAT 5160 

GTTAGAAGTG AAATATGATA TGAGAACTGG GCATTATACG CCCATACCTA ATGAACCTCA 5220 

TTATTTGGTT ATT AGT CATG CGGATAAACT TACCGCAACA GAAAAAGCGA AATTAAGATT 5280 

ATTAATCATA AAACAGAAAT T AG AT ATTT C ATTGGCAGAA AGTGTAGTTT CTTcGCCTAT 534 0 

AGCGAGTGAA CATGTGATAG AACAATTG A C ACT ATTT CAA CATGAGCGAC GACATTTAAG 5400 

so ACCTAAAATA AGTGCGACAT TTTTAGCCTG GTTGTTGATA TTTTTAATGT TTGCATTGCC 5460 

AATCGGTATC GCTTATCAAT TTTCAGATTG GTTTCAAAAT CAGTATGTGT CAG CATGG AT 5520 

AGAATATTTA ACTCAAACAA CATTGCTCAA TCACGATATA TTACAGCATA TATTATTTGG 5580 
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ATTGATTAGT TTATCAACTG CTATAATTGA TCAAACAGGA CTCAAATCAT GGATGATATG 57 00 

GGCAATTGAA CCGTCAATGT TATGGATAGG ATTACAAGGT AATGATATCG TGCCACTATT 576 0 

AGAAGGGTTT GGATGTAATG CAGCAGCTAT TTCACAAGCA GCACACCAAT GCCATACCTG 5820 

CACGAAGACA CAGTGTATGA GTTTAATAAG CTTTGGTAGT TCTTGTAGTT ATCAAATAGG 5880 

TG CG AC ATT A TCTATTTTTA GTGTAGCTGG AAAGTCATGG CTATTTATGC CGTACTTAAT 5 94 0 

ATTAGTACTT TTAGGTGGCA TCTTACATAA AGGATATGGT TGAAAAAGAA TGATCAACAA 6000

CTTAGCGTTC CGCTAC CTT A TGATAGGCAA TTACATATGC CAAATATACG TCAAATGTTG 606 0 

CTACAAATGT GGCAAAATAT ACAAATGTTT ATCGTTCAAG CG CT AC CT AT TTTTATCACA 6120 

ATCTGTCTTA TTGTTAGTAT TTTATCACTA ACGCCAATTT TGAATGTTTT ATCACAAATA 618 0 

TTTACACCTA TATTATCGTT ATT AGG CATC TCGTCAGAAT TGTCACCAGG GATTTTATTT 624 0 

20 TCAATGATTC GAAAAGACGG CATGCTCTTG TTTAATTTGC ATCAGGGCGC CTTATTACAA 63 0 0 

GGAATGACAG CAACACAGTT ACTACTACTT GTGTTTTTTA GTTCAACATT TACAGCGTGC 636 0 

TCGGTCACAA TGACGATGCT TTTGAAACAT TTAGGTGGTC AGTCAGCACT AAAATTAATT 642 0 

GGAAAGCAAA TGGTGACATC ATTGTCTTTA GTTATTGGTG TAGGCATCAT TGTTAAAATA 64 8 0 

GTAATGCTGA TTATTTAAAA AAAATGAACT ATAACTGAAT ATAGAGTCAT GTCAGTCAAT 654 0 

AGGAGATCTA TCTTGGAATA TG CT ATT CAT ATGAAGTATA AGAGGAGAGT CGCAGATGAA 6600 

AATAGTTATT ATAGGTGGGT TTTTAGGTGG CGGTAAAACG ACTGTCTTAA ATCATTTGCT 66 6 0 

CGCTGAATCA TTAAAGGAAT CGCTGAAACC AGCAGTCATC ATGAATGAAT TTGGGAAAAT 672 0 

GAGTGTTGAT GGTGCCTTAG TATCTGAAGA CATACCTTTA AGTGAACTGA GAGAGGGGTG 6780

TATCTGTTGT GCAATGAAAG CAGATGTATC AGAACAGTTA CATCAATTAT ATTTAAAAGA 684 0 

GCAACCAGAC ATTGTATTTA TTGAATGTAG TGGGATTGCA GAACCGGTCT CTGTCTTAGA 6900 

40 TGCTT G TTTA ACGCCTATTT TAGCTCCGTT TACAACAATT ACACATATGA TTGGTGTAAT 6 960 

AGACGCAAGC ATGTATAAAC ACATTAAATC ATTCCCTAAA GACATCCAAG GCTTATTTTA 7020 

TGAGCAATTA GCATATTGTT CTGTCTTATT TGTTAATAAA ATAGATTCAG CAGATGTTGA 7080 
AACAACGAGC AAACTATTGA AAGATTTAGA AGTTATTAAC CCAGAGGCCG ATATACAAGT 714 0 

CGGTATGCAT GGCAGCGTCA CTTTGCCAAT ATCAGTTAGA CAAATGACAG CAACTTCTGA 7200 
CAATAAACAT AAGTCTTTAC ATCAAATGAT TAATCATCAA TTTGTGCAAT CACCAGTCAA 7260 
ATGTACTAAA GCAGAGTTTA TAAAACGTTT AGCATGCCTT CCGTCTCATA TTTATAGGTT 7320 
GAAAGGGTTT ATGACATTTG AAGACACCGC ACATACGTAT CTCATTCAAT TTACACAAGG 73 80 
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CGGAAAGGGT ATTTCAAAAG AAGACTATCA ATGTTTGGAA CAGTAGTGTT TTCAGTGGAA 7500

GAGAATGGTT AACATGCCTT CATGTATAAT AACGAGTTGA TTTGAACGTT TAAGCGTAAA 7560

TAAAAATAAG CTTGGTCAGC CATCAAATAT AATTTGAAAA CTGTCCAAGC TGTTTTATTA 7620

GAGAACAATC AATTAACCCC ACATATTTAA TAATACATCA GCAAAGCCTT CAGGTTTTTG 7680

AATATAACCT AAGTGACCGC CTGGAATATC TACAATAGGT ATGCCAGTTT CTTTATTTAT 7740

ATAAAAGTTA ACATCTTGTG GGAAGGAGCC TCTAGAATCT GTCCCATTTA GTAGGGTGAT 7800

TTTATCGCTG TATTTTGTGA AATCATCCAA AGTAATATCT GAATGCGTAT ATTGTCTAAT 7860

TTCAAATTCT GACCAGAACA TCGTACGTTT GTACTGTTCT ATACGTCCTT CTTCAGTATC 7920

AGCAGGTTGA GACATCATTT TTGCATCAAT TGGTGCGATA TTTAATGTTT CGCCAAATGT 7980

TTTCATGCCT TTTTCTAAGC CTTCTGTTAA AATTTGATGC ACAATGTCAT CATTTTTATC 8040

TTTCCAATAA GTACTGTCTG GTAAAAATGT ATTAATTGGT GGTTCGTGAA ATGCAATCTT 8100

TTTAACGACT TCAGGGTAAT CTTTTAACAC ATGCATCGCA ACGATTGAAC CTGAACTTGA 8160

ACCTAATATA TAGACAGGTT CATCACTTAA TGACTTTGCA AGTTCGGCAA TGTCCTGTGC 8220

GTCGCGTTTG ACACGATAAT CACTGTCAGG GTTTGAAGCG GAATCAGGGA GTGGTTCAGT 8280

TAACTCGCTT TCTCCATAAT CACGACGATC AACGGCTACA ACAGTAAAAT GGTCTTTTAA 8340

CTGTTCTGCA AGAGGCAGAA AAATGTCTCC GGTACCGTTT GCACCAGGAA TAAAGATGAG 8400

CACGGGTCCT TGTCCGACTT GGTGGTATCG TAATTTAGCG CCTTGTAATT CTAAAGTTTC 8460

CATATTCAAT GACCTCCATT TGTTAATTGT TAGGTGATAA ACCTAATAAT TTAGCACCAT 8520

TTGTATAACT TATTTTCTCT TTTTCTTCAT CTGTTAAACC CAGTTCATCT AAAAATACAC 8580

CTAATTTTTC AGGCTCAATA TATGGATAAT CAGCAGCATA AAGAATTCTA TCAATACCTA 8640

CTTCTTTCTT GACTAAATCA AACTGTGGCT TCGTTAACAT GCCACTCGGT GTGATATAAA 8700

AATTATTITT AAAGTAATAG CTTACAGGGT GGTTCAAATG TTCAGCGAAT AAAGCTTCAT 8760

CCATACGTTC TAAGAAGAAT GGGATAAACT tACCCCAATG TCCAATAATC ATATTTAACT 8820

TTGGATAACG ATCAAAAATA CCAGATAATA CTAGATGTAT TGTATGAATG CCGACATCAA 8880

TGTGCCAACC ATAACCAAAA CAAGCAAATG TTGCCGCAGT TACTTCAGGA TAATTTCCTT 8940

TATAGTATGA TTGATAAATG TCACTGTTAA CTGGCGCGGG ATGTAGATAA ATCGGTACGT 9000

CTAAATTTTC AGCTGTTTTG AAAATAATGT CATATTTGTC TTGATCAAGA AAACCATCTT 9060

GTGCACGTCC CATAATGAGC GCACCTTTGA ATCCTAAATC ATTGATGCAA CGTTCGAATT 9120

CTCGCGCTGC GGCTTCAGGC TCATTGATAG GTAAAGTTGC AAAGCCTACA AAGCGATTGG 9180
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TCTGACCAAC CAAATTTGAA GGAGAACCAT TTCCATAAGA TAAGACTTGA ATTTGAACGT 9300

CTTGATTATT CATAAATTGG ATACGTTCAT CATGATGTGA TAATTCGTCG GCATTTGTAA 9360

AACCTGTCTT TTTTTcAAGG CCTTCTAACA TTACTTTCAT CGGTACACCT TTAGGATCTG 9420

CTGATATCGC ATTCATCGTT TCTTTTTGAA TATCTTCAAT GACATAATGT TCTTCAAACG 9480

TAATACTTTT CATTTACTTC GCCTCCATAT TGTATTGCAT GTTTATTGCA TCTATTGCAG 9540

AAGCATTTTT TATATACCTC TAATTTCAAT GTTTGTAACA TAAAATTGAT CTACCAAGGC 9600

ATCTCTCCAT CGCCATTAAT AAATGTACCT GTTGGGCCAT CTGCACCAAT CGTTGCTAAT 9660

TGAATGATTG GCTTGATTCC TTCAGAAACG TGTTTGGAAT TATTACTAAA ATCACCAACT 9720

AAATCAGTAT TTGTAGCGCC TGGATCAGCA GCATTGATTT GCATGTTAGG TAATCCTTTA 9780

GCGTATTGTA GCGTTAGCAT TGTTACTGCC GATTTAGACG AACAATAAGC TAATGAATTC 9840

ACTTTAGATT CAGCTGTTTC GGGGTTTGTA ACCATTCCAA ATGAACCTAA ACCACTTGAT 9900

ACGTTGACGA CAACAGGTTG TTCAGATTTT TCTAAGAGAG GGACGAATGT ATTCATCATT 9960

CGTACGATAC CGAATACATT CGTTTGATAT ACTTCTTCAA CGTCACGAGG TGTCAATTTG 10020

GAAGGTGCTG AAAATTGACC AGATATACCT GCATTGTTAA TGAGGATATC AAGACGGCCT 10080

TCTTTTTCAG CAATCATGTT ATAAGCATTT TTGACTGAGT AGTCACTTGT AACATCTAAT 10140

TGTACATAAT GAACACCTAA TTTTTGTGAT GCTTGTTGTC CTCTTACATC ATTCCGAGAA 10200

CCTATATAAA CTTTGTAACC CAATGCTTTA AGTGCCTCTG CACTTGCATA GCCTAACCCT 10260

TTATTGCCTC CTGTGATTAA CACAATTTTA GTCATTACGT CCCACCTCAT CTAAATAAAT 10320

GTTTAATAAA TAATTTCTGT ACGCTTCAAT TGAAATATGG CGATGCTCTA TTTGGAAGGC 10380

AAATACACTA GTTGATAATG ATTGCAACAG CATATCTGTT TTGAAtTCGT GTAAGTGTCG 10440

TCATCGCTTT TAAATAAGTC ATAATAAAAA TCAAATAATT CTTGATAAAA TGCGCTTTGG 10500

TAAAAACGTA ATTTATTGTT GCCTGCTTCA ATACATTGCA GTAGTGCCTT ATTATCGATT 10560

TTAAATTGTA AAAGATAATC TAACGACACT TGCATAACCT CATAATTAGA ATGATAGTCA 10620

TCTTTAATTT GCTTAAAATG AGTGATAAAA ATATCAAGGT CTCTTTGTAT GACGTAGTAG 10680

CATAAATCGC TTTTATCTTT GAAATGTCGA TACAATGTCC CCATACCGAT ACCTAGTTCT 10740

TTAGCAATAC GATTCATACT AATGTTTTCA ACGCCTTCTT CATCAAAAAG TTTGTGCGCT 10800

ATTTCTTCAA TTCGTTGCCT ATTCTCTTTT GCATCTTTTC GCATGATTAC ACCTACTTAA 10860

AATTCTCTAA AATTGACAAA CGGATAACTC TCCGTTTATT ATAAAACGTG TTAAGAAAGT 10920

TAGCAATGAA TTTGCAATAA CTATTAAATA TCATAAAAGA AAAGAGTGTT GATAATGTCT 10980
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ACCTTATCGG TTCAAATGAT TGCTGAAAAA CTGAATGTCA CTACAGAAGA TGTGGAAAAA 11100 

GT ATT AG CT A TGACAGCGCC ACTAGGCATT TTTAGTCATC AATTACAACG ATTTATTCAT 11160 

TTAGTATGGG ATGTCAGAGA TGTAATAAAC GACAATATTA AAGGAAATGG ACAAACACCA 11220 

GAACCATATA CGTATTTAAA AGGTGAAAAA GAGGACTATT GGTTTTTAAG A 11271 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6261 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

20 CAACCCGTTC AGAACAAAAT AAAAACCGTA CAATTTTATC ATCTTAATGA TTATTGTACG 6 0 

GAAAAACTTT TTTACATCAT ATCTGCATGT GCATAATCGA TATCGGTAAA TTTATTATAT 12 0 

TGTTTCATAA AATGTAACTT AACTGTGCCT GTTGG AC CGT TACGTTG CTT AGCAATGATA 18 0 

ATTTCAATTT CACCGTTTTC ATCATTCGTT TGTGGCTCGA AACCACCATC ATCGTCATCA 24 0 

TCTTCATCGC CGCCACGGTT ATAGTAATCA TCACGGTATA AGAATGCAAC GATATCGGCA 30 0 

TCTTGCTCAA TCGAACCAGA TTCACGAATA TCACTCATCA TTGGACGTTT ATCTTGTCGT 36 0 

TGTTCAACAC CACGAGATAA CTGACTTAAT GCGATAACTG GACATTTTAA TTCACGGGCT 42 0 

AATGCTTTTA ATGTACGAGA GATTTCAGAA ACTTC CTGTT GTCTGTTATC GGACGCACGT 48 0 

35 GAACCACTAC CTTGAATCAA CTGTAAGTAG TCAATCACAA TCATGTCTAA GCCATGTTCT 54 0 

TGCTTTAATC GACGACATTT AGAACGTAAA TCATTAATTC GAATACCCGG TGTATCATCA 600 

ATAAAAATCT TCGTACGTGA TAATTTACCT ACCGCTATAG TAAAACGACT CCAATCTTCC 660 

TCAGTCATAG TACCCGTTCT TAAGCGGTTT GAGTCAACAT TTCCAGAACT ACAAATCATA 720 

CGTGTGGCTA ACTGATCAGC ACCCATCTCT AGCGAGAAAA TACCAACTGT ATACATATCT 780 

TCATGCGTTG CAACTTTTTG TGCAATATTA AGTGCGAACG CAGTCTTACC TACAGATGGA 84 0 

CGCGCTGCAA GGATAATTAA ATCATTTCGG TTGAACCCTG CTGTCATTTG GTCTAAATCT 900 

CGATATC CTG TAGGTATACC TGGTGTTTGA CCACTATTTT GATCAAGCTC TTCAGCTGTT 960 

50 TCATACACTT GTCCTAAGAC GTCTCGAATG TCTTTAAAGC CATCGCTTTC ACGAGAAGAT 1020 

GATAGCTCTA AAATTCGACG TTCTGCATCA CTTAAAATCG CATCTAGTTC AAGTTCATCA 1080 

TTATATC CAT CATTGGCAAT ACTATCTGCA GTTTGAATCA ATCTACGTTT TAATGCATGC 1140 


EP0 786 519 A2 

TCTGCAAGAT ATTGCGGGCC ACCCGCTTcA TTCAACGTAC CTTCCGTCGA TAATTGATCC 1260 

ATCAATGTTA CAACATCAAT TTCTTTATTA TCTTCATTTA AGTGCATCAT TGCACGGAAA 1320 

ATATGTTGAT GGGCACCCCT ATAAAACGAC TCAGGAAGCA AAACTTCCTG AGTAGTATTA 13 80 

ATCAATT CTG GATCTATAAT AATTGAACCT AAGACAGACT GTTCAGCTTC ATTGTTATGC 144 0 

GGCATTTGAT TTTGCTCATA CATTCTATCC ATGAATGGTT ACACCTCTTA TTTCAATCCA 150 0 

ACTTTATTGT TCAACTGTGT GTACGCGAAT TGTACCTTCA ACTTCTTTAT CTAATTTAAC 156 0 

AGGTACATTC GTATATCCTA GGGAATGAAT TCCATTTGGT AAATCCATTT TACGTTTATC 162 0 

15 AATTTTAATA T CATGTTGTG CTTTTAGTGC TTCGGCAATT TGTTTTGTAC TTACTGACCC 1680 

AAACAATTTA CCACCTTCAC CAGTTTTTGC TGaTACTTCA ACTTCAATGT TTGATAACGT 174 0 

TTCTTTTAAT GCTTTAgCAT CTTCAATTTC TTGTTGGCGT TCTTGTTTTG CACGTTTTTT 1800 

CTGTAACTCT AATTGTTTAA GGTTACCTGG TGTTGCTTCT ACAGCATAAT TCTTTTTCAA 1860 

TAAGAAGTTA TTTGCATAAC CTACTGGTAC TTCTTTAACT TCACCTTTTT TACCTTTACC 192 0 

TTTACCTTTA ACAT CTTGTG TAAAAATTAC TTTCATGCAT CTTCACTCCT ACTTAATTGT 1980 

TCTGTAATTG CTTGTTGTAA TTGTGCTATC GCCTCTTCGA CTGTCACACC TTTAAGTTGT 2 04 0 

GTTGCCGCAT TGGTTAAATG TCCACCGCCA CCAAGTGCTT CCATTGTTAA CTGGACATTT 2100 

3o ACTGAACCGA GTGAACGCGC AGATATACCA ATCAGATTAT CTTCACGTCT CGCAACAACA 2160 

TATGATGCTT CAATACCTTC TAAACTTAAC AGTTCATCTG CTGCTTGTGC AACTGTTACT 2 22 0 

GGATGATAAA TTTTATCGTC TGAACCATGC G C AATGGCT A TGCCATTATC TTCAACTTTT 228 0 

ACAGTTCGAA TTAATTCAGA TCGATTAATG TAAGTATCCA CATCATCTTT TAAGAAATGT 2340

TGCGTTAAAA TCGTATCTGC ACCATGTGCA CGTAAATAAC TCGCTGCATC GAATGTTCTT 24 00 

GAT CCTGTT C GTAATGTAAA GTTTCTTGTA TCTACAATAA TACCTGCATA CATCACTGTT 2460 

GATTCAAGAC GTGTTAAACG TTGTTCTGTT GGTTGATATT CCAGTAACTC TGTTACCAAT 2520 

TCAGCTGTCG AACTTG CGTA TGGTTCCATA TATATCAACA ATGGATTAGA GATGAAGCTT 2 580 

TCACCACGTC TATGATGATC GATAACAACT TTACGGTTTG CTTTATTTAA GACATTTTCA 264 0 

TCTAAAACCA GTTCCGGTTT ATGCGTATCA ACAATCACTA CGGTTGTCTT AGATGTCATC 2700 

ATATCCCAAG CATCATCTGA TGTAATAAAT CGCTCTCTTA ACTCTGGCTT TTTATCTATT 2760 

50 TCGTTCATCA CGCGTCGTAA TGTTGG AT CA ATGTCAGTCT CATTTAATAC GATGTATGCT 2 820 

TCTAAATTAT TCATCATTGC AAATCTAGAC ACACCGATTG CTGCACCAAT TGCATCTAAG 2 880 

TCAGGACGTT TATGTCCCAT GATAATGACT TTGTCACCCT CTGCAAGGAT ATCTTTTAAC 2 94 0 
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CCATAGAAAC GCACATTACC ATTAATACTT TTAATTGCAA CTTGGTCGCC ACCGCGTCCT 3 060 

AATGCTAAGT CTAGGCCTGA TTGTGATAAT TCACCTAAGT CGATTAAATT TTCAGTACCT 3120 

TCACCAACAC CGATACTTAA TGTTAATTGG GCACGATAAC CAACACTTTT TTCACGTAAT 3180 

TGACTCAAGA TATCAAATTT AGATTCTTCT AAGTCAGCTA ATATTTTTTG ATTTAAATAG 324 0 

GCTACGAATT GATCGGAACT GTATCTTTTG AAAAATATAT TATACTCAGT TGCCCATCGA 3 3 00 

CTAATGACAC GCGTTACCAT TGAGTTGATT TCCGAACGCT GCGTATCATT CATATTTTG C 3 36 0 

GTAATCTCAT CGTAGTTATC TAAAAATAAT GTCGCAATGA TTGGTTTAGA ATTTTCATAT 34 20 

15 AGTTCATTTG TTTGTACTTG TTCAGTTATA TCAAAGAAAT AG AGG CAGTG ATCATTCTCA 34 80 

GAATAACGTA CTTGGAAATG ATACTGATTA TATTCTATTT cAACGGATTT CACTCTATCT 3 54 0 

AATTGCTTTA AAATGTTTGG AAATACTTCA TTTACAGATT CAGAAATGAC ATTCGCTTCC 3 600 

20 ATATGATCTG TCATAAATTG GTTAACCCAT TCGATGTGAT CATTTTCATC TAAAACAATG 3 660 

AT AC CAATTG GTAAATGTTT GATTGCTTTA TTATTTGTTG TTGAAATTTG AGCACTCAAA 3720 

C CAT CT A CAT AACT ATC CAT TTTCATTAAA GCTTGTCTGA ATAAAATGAT GCTAACAATA 3 780 

ATCATCACGA CAAGAACGAT AG ATG CAATT AGTGCTATAA G A CT ATT AAA GATAAACCAT 3 84 0 

ACACCCATTA AAA CAATTG C TGTGATGATC ATGATGACAA ATGGTATTAG TAAAGCTTTC 3 900 

TTAGTGGACT GCCGATTCAT T ATT C CAC CT CTATTCACTT TTTAGAATTA TTTTTCATGA 3 960 

TTCGCTTCAA ATTCAAACTT AAATCGATAA CACCAAGTAG TCCTACAATA TGTGTCGTAG 4 02 0 

GTGT CAGTAT TGTACCGATA ACCAATAGTA AAATCGTTAC TGCATTCGGC AAACCTTTCG 4 0 80 

CTTTACCAAA GAAATGAATA ACACTTAAAC CTTGAATATA CATTACTAAT GATAACACAA 414 0 

GTTGGAAGTT TAAAAGAATG CTCTGGAACA CACTCGGTTG ACCTGTAAAT AATAAACATA 4200 

TGATAACAAT AATGT AT AT C CATAATAAAA TACCGCTCAT TTGCCACGCG AAAAGTGGCT 4260 

40 TAAATACAGG TGTAGCGATT TTAAATTTTC GTAAAATCGG AAATGTAACG ATTAAGTTAA 4 320 

TTAAGACGAT TAAAAATGTA ATGATAATGA TGAAACCTGG TAATTGAACG GTCGCTTGTC 4380 

TAAACCCTTC TTCTAATATT TGGGTCATAT TCGCATCGGC ACCGCTCATC GTAATCGCTT 444 0 

CATGTAATGT TTGCTTGAAA GGTTTTACTA TGCTCGCTGA TGGTGGAATC CTTCCGAATG 4 500 

TTTGT AG TAA CATAAAAGCG ATTAATGAAA TTnArCTCAT CGCTACTGTT GTTACGTATA 4 560 

ATATTCTTTC TTTAGACGTT CTTTCTTTGA GCAATTGACC AATAATTAAA CTTGCAATTA 4 620 

AGACTAATAT GATGGCACTT AAAACGAAAG T ATT AC CTAA AACAGTTGTT ATAATTACTG 4 680 

TAATAAGTGC ACTAATCCCG AAAGATTGTA TTGATTTATT CCATAAAACG ATACCTGGTA 4740 
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CAAATACCAA CGCAATCGTT GCAATTATTG TTGCTTTAGG TTGTATTTTT GAAAACACAT 4 8 60 

AAGCCACTCC CATATTTTTA ACT AT AG CT A TTATTTTAAC CTCTTTAATG AAAATTAACA 4 920 

ATTTATAGAT TGTATGCTTC TATTTCATTT AATTGAATAA TAACTTTCAT GTTTTATAAG 4 980 

TAATTAACAT ACTCATTTGA ATCGCTTTTG TGTGCTTTCA TTTTCAACAT GATTATTTAA 504 0 

TCCCACTACA TAGCAATCAA GCTTGATTTA GATTTACAAT ACATTTCCAC TCTCATGTAC 5100 

TCTAGATGTT TTTGAATATG ATAACTGTGA TTTAGTGGCT TCATTCTTTG AAAATATATA 5160 

TTATTACTTA CGCTTAAAAT GCTTTAAATT TAAGAAATGA TATAAGTTAG GTGCCCAGGT 5220 

ACTAAAGTTT AGTAGGaATC CATCATGCCC AACATTATCA GGCACGAAGA AATGACGATG 52 80 

ATATTTAAAA CGTTCACCTA ATGCACGAAC TTGATCATCC GGATATAGCA AATCATCTAT 534 0 

GAACCCCATC GTTAACACTT TTGTTTCTAA ATTTTTAAAA ACATGCGTTA CGTCTGTGCG 54 00 

20 ACCTCGGTCA ATGTTGTGAC TATCCAATAC ATCTAGCAGT GTCAGATAAC AATTCAAATC 54 60 

AAAATGTTCT TTAAATTTAT TACCTTGATG TTGTTGGTAT GCGACTACTT CATCCGGCGT 5520 

AAAACGTTCA TCATAACTTT TTGATGATCG ATATGTCAAA AAACCTAATT GGCGTGCAAT 55 8 0 

ACTTAGACCT TCCTTACCAC CAAGATGAAT GGCTTGCCTT GCAATTTCAT TGAAAGCTCT 564 0 

ACTATAAGAT GATGTTCGAC TTGTTGCAGC AAGGATAATG GCTTTATCTA CTTCAAACTG 57 00 

TTGATTGTAG AGTAGTTCCA TTGCTTGCAT ACCTCCAAGA CTTCCCCCTA TTAAAATATT 5760 

AATCTTATCA TAACCAAGGG CTTGTATACC TCGTTCATTC GCTCTGACTA TATCTCTTAA 5820 

TGTTAATTTT TTAGGAAAAT GAGGGTCGTT TAAAGGTGAA CTTGAACCGA AAGGACTACC 58 80 

35 AATAACATCA AATGTTAAAA ATTGATAATC GTGAATGGGT ATATATCCCC CAT CAAT AAT 594 0 

TTCTCGCCAC CAACCCGGAT AATCATCTGT TCCATATGTT AAATGATTGC CAGTTAATGC 6000

ATGACAAACT ACAACTAATG GTTGTC CATG ATAACCGACA TGCTCATATC TCAAACGCAA 6 060 

40 GTnATCTATG ACTTCCCCAG ATTCTGTAAT AAATTCCCCT AAATTTAAAG TATCTACTGT 6120 

GTAATTTGTC ATTGTTCTTT CCTCCTTAAA CAAAAAAACT TCTCACCCTA TTGAAAAGTA 6180 

AGAAGTCTTT ATACTTATCA TTCGAGTAAC TCGTTGGTTT TAGCACCGTG CTATAAAGTC 6240 

GGTTGCTGAA GTATCACAGG G 6261

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1222 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGCGATTAA CTCTGGAAAT ATCTTTTCCA TATTTACGTn TTAAATTATT CAGCAAATTC 60 

AT ACG AG a TT CATACTCGTT yAACACTTGT TCGTCGAATT CTGTATTAGC CATTTCATCA 120 

TATAACTCAT GTTTTGCATC TTCTAAAATG TAGTAAAATT GATCAATATC TTCTTTTAAT 18 0 

TTGTCATATT TGTTTGGAAC TATATCGTTT ATTGTTAACA AATGGTTGCT TAGTTCATAT 24 0 

AAACGATCAG TG AT AG CATT TTCATCCGTT AATGTCATAT ATGCGTTATT AAGCGCTAAG 3 00 

CTTAATTTTT CAGAGTTTTG AATGCGTTTA AT AT CT ATTT CAAGTTG CTC TATTTCGCCT 3 60 

T CTTTT AG AT GTGCTTCAGA CAATTCTTCT AATTGGAATT TCATTAAATC TAAACGCTGT 420 

AGCAATGCTT GGTCTGCTGA TTCTAAATCT TCTAACTCTT GCTTTTTGGC TTTATAATTT 4 80 

TGAAAAGTTT GGTGATATTT ATCCAACAAA TCTTGATAAC GTGATTCTGC GTAATTATCC 54 0 

20 AATAATGTTA AATGGTATTT TTGTTTCAAC AAAGACTGCG TTTCATGTTG GCCATGAATA 6 00 

TCTAATAATT CTTGCATAAC TTTT CGTAAA TCTTGTAAAG TAACTGTTTG ATTATTAATT 660 

TTACAAAGAC TTTTACCAGA GCTGAAAATT TCCCGTTTAA CTAATAAAAA ATCTTCATCT 72 0 

ACATCAATAT CCATATTTTT CAATATATGT ATAGCATCTT TACTCTCGTC AAT AT CAAAT 780 

ATACCTTCGA TGACAGCCTT TTTTTCACCA TGTCTTACAA AATCAGATGA AGCTCTCATT 84 0 

CCAATTAATT GTCCAATTGC AT CT AT AAT A ATTGACTTAC CTGAAC CCGT TTCACCACTT 9 00 

AAAACAGTTA AACCATCAGA AAATTGAATT TCTAATTCTT CAATAATAGC AAATTGCTTG 960 

ATTGATAAGG TTTGTAACAT AAACTCATCG CATCCTTATA ACAAATTGAA AATTCTTGAC 1020 

TTGATTTCAT CACTTGCCTC TTTGCTTCGA CAAATAATTA AACAAGTATC AT CAC CACAA 108 0 

ATTGTG CCTA GTACTTCTTC CCAATTGATT TGGTCTAATA TAGCTCCAAT AGATTGTGCA 114 0 

TTACeAGGTA TGTTTTTAGA ACAAGTAAAT TATCAGTACC ATCTATATTA ACAAAGGAAT 1200 

CCATTAAATA ACGTCCCAAT TT 1222
(2) INFORMATION FOR SEQ ID NO: 14: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
TTTGTTATTA TTACnTnAAA TAATTGCATT ACTTTTTACT GATGGTACAA CTTTCCATCC 60 
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TTCTTTTGGC ACGACATAAT TGTCTTTATC TTGAACTAAA TATCCGCCAG ATACTGAAAC 180 

AAACTCTTCT TCGTTACTGT CTATAGTCAT ATCAATTTCT AATAATCTTA CATTCTTCTT 24 0 

TTGTTTTAAA ATATCTAATG CTTCATCTGT AAATTTTGGT GCAATAATGA CTTCCAAAAA 300 

GAT ACT ATG C AATTGCTCTG CTAACTCAGG TGTTACAGCT CGGTTTAATG CAACAATTCC 360 

ACCAAATATT GATTGACTAT CCGCTTCATA CGCATGTTGA AATGCTTGTT CTATCGTGTC 4 20 

ACCGATACCA ACACCACATG GATTCATGTG TTTAACCGCA ACTGTAGCAG GTGTATCAAA 4 80 

CTTTTTAACT AAAGCTAGTG TAG CATCTGC ATCTTTAATA TTGTT AT AG C TTAATTGTTT 54 0 

CCCATGTAAT TGTTTAGCGC CTGCAATCGT GTG CTTAGCA TTCGAAGTTC TCACAAAATA 600 

CGCTGATTGT TGTGGATTTT CTCCATATCT TAAAGTTTCT TTATCCCCTT TAAAGAAACG 660 

TACAATCGCT TCATCATATT CTGCAGTATG CTCAAAAACT TTAATCATTA ATGATTGTCT 720 

20 ATATGACTCA TCTAACGAAT CGTTTCTTAA TCGCGTCAAT ACTTCTTGAT AATCTGCCGG 780 

ATGTACAATT GTTGTTACAT GTTTATAGTT TTTAGCTGCA GCACGTAACA TTGTTGGACC 84 0 

ACCAATATCA ATATTTTCAA TTG CTTCGTC CATCGTCACA TCAGGGTTTG CAACAGTTTG 900 

TTGGAATGGA TATAAATTAA CTACTACCAT ATCAATTAAA TCTATATGTT GTTCTGATAA 960 

TTCATTTAAA TGCTGCGGTT TATTTCGATC AG CTAAAATG CCACCATGAA CAGCCGGATG 1020 

T 1021 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3759 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCATTCACTC CTAAATTGTT ATTACACTAT TACACaTAGC TAATCATCAA TGTGAAATCA 60 

CCTTCAAAGA CACTATCCAA ATCTTCAGAA GTCAAAATAA AGTTTGTACC AGTAGTCAGT 120 

TTGAAAATTT CACCATCGAC AATCATTTGC CCTTCGCCTT CCAACACTGT AACTAAACAG 180 

AACTCTCTAG GCTTCATATA ATTTAACGTG CCAGAAATTT CCCATTTAAC CAATGTAAAG 240 

AAATCATTCG ATACAATGTG TGTACACTTA TGGTTTTCAA TAATTTCGCT TTCAGGCAAA 300 

ATATTAGGTA ATGGTGCATT GTACTGAATA ACGTCTAAAG CTTTTTCAAT ATTTAACGGT 360 

CTATCATTAT ATTGATTATC TTGACGATTG AAATCATAAA GTCTATATGT AATGTCTGAC 420 
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ATAAAAtAGa ATTCyCCAGG kTTTACtTTA AtatATCyAA gTAtCGaCtC tATCGTTCCG 540 

TGTTGAACAT GATTCGCAAC TTCTTCTCTA GACTCTGCTA ATGTCCCtAT AACTATTTCT 600 

GCATCTTCTT CTGCATCTAT AATATACCAA CATTCAGATT TGCCATATTG CCCgTTTTCA 660 

TGCTCATAAG CATAAGAATT ATCAGGGTGC ACATGAATAG AAAGTGATTC TCTTGCATCC 720 

ACTATTTTAG TTAGAAGCGG AAAATCTTTG CTTGGGAAAT CACCAAACAA TTCACGATGT 7 80 

TCTGACCAAA TACGGTCTAA TGTTTGACCT TGATATGGTC CATTAATAAT CTCGCTCGTA 84 0 

CCATTTGGAT GTGCTGACAC ACACCAACAT TCCCCCAGTT GTATCATTGT CTAATTGATA 9 00 

TCCAAACTCA CTTAGACGTT GACCGCCCCA TAATTTTGTT TTTAAAATTG GTTGTAAAAA 960 

TAATGGCATT GTTGCACCTC CATTGTGATT AAGTAAGCAA TAGAACTCTG ATGTTGTTGT 1020 

T C CATT AT AT TTTGATTTTG TTCTCATTTA CATCGTATTA TTAACTTCCA CATTTCAAAT 10 80 

20 TAACTATTAG TGATTGTACC ATATTTACTA ACATTGCAGT ACTGCCAATT AAAAGnGCTT 114 0 

CACTTAAATT TACAGTACTT TAACATTTTC AAAAATTTAT AGCATAGAGA TT AT AT CT CT 1200 

CTTACATTTG TACATATTTC CCTTTAAATT TACTCGCCCA TT AT AC CAAT TAATAaACAA 1260 

CTTTAATAGT TGTGCCATAC ATTGTTCAAA TTCTTTGTAA AACGCATAGA CAATACGTAC 1320 

TTATTCATAC TTATAATTCA TCATTTTCAA AAAATAACGA GTTACGAAAA AGTAACCCGC 13 80 

TTCAAATCAT ATTTACTATC CTTATTAATC CGTTTCATTT TCAAATTGAG TTAAAGCATC 144 0 

TTTAATGTCC TGATCACCAC TAATAATTTG AAACTCTTGG TGATTAAAAT GATTGGATGT 1500 

GACAATTTCT TTTAATACTG TCGCAACATC TTCTCTAGGA ATTTCACCTT TACCATCAAA 156 0 

ATATTGTGCA GCTTCTATCT TTCCAGATCC TGCTGCATTT GTAAGTGCCC CTGGATGTAA 1620 

AATTGTATAA TTCAAACCTG nAACGTCTTA AATAGTCATC AGCGTAATGT TTAGCTATTG 1680 

TATATGGCTT TAAATCACCG CTATCATCAA AAGCCTGACG TCTCGAATCA TATGTTGAAA 1740 

40 CCATGACATA GTGTTTAATA TTGGCCTCTT TACTCGCAAT CATTGATTTA ACAGCACCAT 1800 

CTAAATCGAC AATAATTGTT TTATCTGCAC CCGTGTTCCC TCCAGAACCT ACTGAAAAGA 186 0 

TAACTTTATC GAATGGTTTA AACGTCTCAG TTAAAGTCTC TATTGAATCA TTTTCAACAT 1920 

CAACAAGAAT TG CTTTCAT A CCTTGTGATT TTAACGCATT AAGTTGATCT GATTGCCTAA 1980 

CACCAGCAGT AAATGGTACA TTTTCTTTTG CTAATTGTTG CACTAGTAAC GAACCTACAC 204 0 

CGCCATTAGC AC CT AT AAC C AAAATATTCA TTTACAACAC TCTCCTATkT ATTATTCTCT 2100 

ATG CCATACC ACTTTATGAG ATATGTAAAA CTTGTTACAA CTATAAAAAT CAATTGACAT 2160 

ACTACTGGGA ACGTATTAAA TTAATATATG AACAAATATT CATATGAAAG GATTGTCATA 2220 
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tCaAGGCATT AGcGATTACA ATCGAATACG TATCaTGGAA TTGTTATCaG TCAGCGAAgC 234 0 

AAGTGTTGGT CACATTtCAC ATCAATTGAA TTTATCTCAA TCAAATGTCT CGCACCAATT 24 00 

AAAATTACTT AAAAGTGTGC ATCTTGTGAA AGCAAAACGA CAAGGCCAAT CAATGATTTA 2460 

TT CATT AG AT GACATCCACG TAG C AACT AT GTTAAAGCAA GCCATACATC ACG CGAATCA 2520 

TCCTAAAGAA AGTGGGTTAT AATATGTCTC ATTCACATCA TCAT CATG AC CATATGCATA 2580 

GTCATGTAAC TACAAATAAT AAGAAAGTAT TGTTTATATC GTTTTTAATA ATCGGTCTAT 264 0 

ATATGTTTAT CGAAATCATC GGCGGTCTCC TTGCTAACAG CTTGGCATTA CTATCTGACG 2700 

r5 GTATCCATAT GTTTAG CG AC ACATTCTCAT TAGGTGTTGC ACTTGTCGCA TTTATTTATG 2760 

CTGAAAAGAA TGCCACAACT ACAAAAACAT TTGGTTATAA ACGTTT CGAA GTACTCGCAG 282 0 

CGTTATTTAA CGGTGTAACG CTTTTTGTAA TAAGTATTTT GATTGTTTTT GAAG CGATTA 2880 

20 AACGTTT CTT TGTTCCTTCT GAAGTTCAAT CAAAAGAAAT GTTAATCATT AGTATTATCG 294 0 

GTTTAATTGT CAATATCGTT GTTGCATTCT TTATGTTTAA AGGCGG CGAC ACTTCACACA 3000 

ATTTAAATAT GCGTGGTGCT TTTCTACATG TTATCGGAGA CTTATTAGGT TCAGTTGGCG 3060 

CCATTACTGC AGCTAkTTTA ATTTGGGCAT TTGGATGGAC AATCGC CGAT CCTATCGCAA 3120 

GTATTTTAGT TTCCGTTATT ATTTTAAAAA GTGCTTGGGG TATCACAAAA TCTTCAATTA 3180 

ACATTTTAAT GGaAGGCACA CCAAGTGATG TTGATATAGA TGAAGTTATA ACTACTATTA 324 0 

AAAAGGATTC ACGAATACAA AGTGTGCATG ATTGCCATGT TTGGACAATT TCAAATGATA 3 3 00 

TGAATGCATT GAGTTG TCAT GTTGTTGTAG ACCATACATT GACAATGAAA GAATGTGAAT 3 360 

TATTATTAGA AAaCATTGAG CATGATTTAT TACATTTAAA TATTCACCAT ATGACTATTC 3420 

AATTAGAAAC GCCTAATCAC AAACATGATG AATCGATTAT ATGTTCAGGA ACACATAGTC 34 80 

ATTCACATAA C CAT CATG CT CATCATCACG CGCATGTACA TTAATAATTT TAACCTACTG 354 0 

40 CCATTGCATC GATTAAACTT TTCAATGGCA GTAGGTTTTT TATGTCTTTA TGGCGACTTG 3600 

TTTGGTCTTT GATGATGCAA TGTTTATTAA CAAATTTTCA ACTATTATTT CTTACATTAG 3660 

TCATATTTTT GACAATTTAC TATTATAATT CTCTAACTTT AGTCACTTTA ATTAATTTTT 3720 

ATTAGATATT AATATGAAAA TAACGTGTTT TTTGTTATT 3759

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13086 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TAATTATCGC GCATAACAAA ACATTAGCAG GACAATTATA TAGTGAGTTT AAAGAATTTT 60 

TTCCTGAAAA CAGGGTGGAA TACTTTGTAA GTtACTATGA TTATTATCAn CCAGAGGCAT 120 

ACGTACCGTC TACTGACACT TTTATTGAAA nAGATGCCTC AATCAnTGAT GAAATTGATC 180 

AACTACGACA TTCTGCTACA AGTGCATTAT TTGAACGCGA TGATGTAATT ATTATTGCTA 24 0 

GTGTAAGTTG TATATATGGT TTAGGTAATC CTGAAGAATA TAAAGATTTA GTAGTAAGTG 3 00 

TTCGAGTTGG TATGGAAATG GATAGAAGTG AATTACTTAG AAAACTTGTc AG ATGTG CAA 3 60 

TATACACGAA ATGACATCgA TTTcCAACGA GGAACGTTTC GAGTGCGTGG TGATGTAGTG 420 

GAAATATTCC CAGCCTCTAA AGAAGAACTT TGTATAAGGG TTGAGTTTTT CGGCGATGAG 4 80 

ATTGACCGTA TCCGAGAAGT TAACTACCTA ACAGGTGAAG TGTTGAAAGA AAGAGAACAT 54 0 

TTTGCGATAT TCCCAGCTTC TCACTTCGTA ACACGTGAAG AAAAGTTGAA AGTTGCGATT 600 

GAACGTATTG AAAAAGAATT GGAAGAACGA TTGAAAGAAT TACGAGATGA GAATAAATTA 66 0 

CTAGAAGCGC AAAGGTTAGA ACAGCGTACC AACTATGATT TAGAAATGAT GCGAGAGATG 720 

25 GGATTCTGTT CAGGAATTGA AAACTATTCC GTACATTTAA CTTTGCGACC ACTGGGTTCG 780 

ACACCATATA CTTTATTGGA TTACTTTGGC GATGATTGGT TAGTAATGAT TGATGAATCA 84 0 

CATGTGACAT TACCGCAAGT TCGAGGCATG TATAACGGAG ACAGAGCGCG TAAACAAGTT 900 

TTGGTGGATC ATGGGTTTAG ATTACCGAGT G CATT AG AT A ACCGTCCACT TAAATTTGAA 96 0 

GAATTTGAAG mAAAGACAAA ACAACTTGTG TATGTATCTG CAACGCCTGG ACCATACGAA 1020 

ATTGAACATA CGGATAAGAT GGTTGAACAA ATTATTCGTC CTACTGGTTT ACTGGATCCT 108 0 

AAGATTGAGG TTAGACCTAC TGAAAATCAA ATTGACGATT TATTAAGTGA AATTCAAACA 1140 

AGAGTgAGCG TAATGAACGC GTACTTGTTA CAACGCTCAC TAAAAAGATG AGTGAAGATT 1200 

aACCACATAC ATGAAAGAaG CGGGTATTAA aGTtAATTAT CTGCATTCAG AAATCAAGAC 1260 

ATTAGAACGA ATTGAAATAA TTAGAGACTT ACGAATGGGT ACATATGATG TTATCGTAGG 1320 

TATTAATTTA TTAAGAGAGG GTATTGATAT ACCAGAAGTT TCTCTAGTTG T CATATT AG A 1380 

45 TGCAGATAAA GAAGGGTTTT TACGTTCTAA CCGCTCATTA ATTCAAaCAA TAGGTAG Ag C 1440 

TGCG CGTAAC GATAAaGGTG AAGTCATTAT GTATGCCGAT AAAATGACTG ATT CGATGAA 1500 

GTATGCAATT GATGAGACAC AACGTCGTCG AGAAATACAG ATGAAACATA ATGAAAAACA 1560 

SO TGGTATTACA CCTAAAACAA TTAATAAAAA AATACATGAT TTAATTAGTG CTACTGTTGA 1620 

AAATGACGAA AATAATGACA AAGCACAAAC TGTGATACCT AAGAAGATGA CGAAAAAAGA 1680 
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TTTCGAGAAA GCTACAGAAT TAAGAGATAT GTTATTTGAA TTAAAAGCAG AAGGGTGACA 1800

AGTAAATGAA AGAACCATCC ATAGTAGTAA AAGGTGCTCG TGCGCATAAC TTGAAAGATA 1860

TTGATATCGA ACTACCTAAA AaTAAATTAA TTGTTATGAC AGGTTTATCT GGGTCAGGTA 1920

AATCGTCATT AGCATTCGAT ACTATATATG CTGAAGGACA ACGACGTTAT GTTGAATCAT 1980

TAAGTGCCTA TGCGCGTCAA TTTTTAGGCC AAATGGACAA ACCAGATGTT GATACAATTG 2040

AAGGATTATC GCCAGCAATT TCAATAGATC AAAAAACAAC AAGTAAAAAT CCAAGATCAA 2100

CTGTAGCAAC AGTAACAGAA ATATATGATT ATATACGTTT GTTATATGCA CGTGTTGGTA 2160

AACCTTACTG TCCAAATCAC AATATAGAAA TTGAATCGCA AACAGTACAA CAAATGGTTG 2220

ACCGCATTAT GGAATTAGAG GCACGTACAA AGATTCAATT ATTAGCACCT GTCATCGCTC 2280

ATCGTAAAGG TAGTCATGAA AAGCTAATCG AAGATATTGG TAAAAAAGGT TATGTACGTT 2340

TAAGAATCGA TGGCGAAATT GTTGATGTAA ATGATGTACC TACTTTAGAT AAGAACAAGA 2400

ATCATACAAT AGAAGTTGTT GTAGACCGAT TAGTTGTTAA AGATGGAATT GAAACACGAC 2460

TAGCTGACTC TATAGAAACT GCCTTAGAGC TTTCAGAAGG ACAATTAACA GTCGATGTCA 2520

TTGACGGGGA AGACCTTAAG TTTTCAGAAA GCCATGCTTG TCCTATATGT GGATTTTCAA 2580

TCGGAGAGTT AGAACCAAGA ATGTTTAGCT TTAACAGTCC TTTTGGTGCT TGTCCGACAT 2640

GTGATGGCTT AGGCCAAAAG TTAACAGTCG ATGTAGACTT GGTTGTTPPP 2700

AGACGCTAAA CGAAGGTGCA ATAGAACCTT GGATACCGAC GAGTTCTGAT TTTTATCCAA 2760

CATTGTTAAA ACGTGTTTGT GAAGTTTATA AAATCAATAT GGATAAACCT TTTAAAAAGT 2820

TAACAGAACG TCAACGTGAT ATTTTATTGT ATGGTTCTGG TGACAAAGAA ATTGAATTTA 2880

CATTTACACA ACGTCAAGGT GGTACTAGAA AACGAACAAT GGTTTTCGAG GGTGTAGTTC 2940

CTAATATAAG TAGACGATTC CATGAATCTC CTTCAGAATA TACACGTGAA ATGATGAGTA 3000

AATATATGAC TGAACTACCT TGCGAAACTT GTCATGGAAA GOGATTGAGT CGTGAAGCkT 3060

TATCTGTTTA TGTAGGTGGT TTAAATATTG GTGAAGTAGT CGAATATTCA ATCAGTCAAG 3120

CGCTGAACTA TTATAAAAAC ATTGATTTGT CAGAACAAGA TCAAGCGATT GCAAATCAAA 3180

TATTGAAAGA AATTATTTCC CGACTCACTT TTTTAAATAA TGTGGGACTT GAATATTTAA 3240

CGTTAAACAG AGCTTCAGGT ACACTTTCAG GTGGTGAAGC ACAACGTATT CGATTAGCAA 3300

CGCAAATTGG GTCGCGTTTG ACTGGTGTCT TATATGTATT AGATGAGCCA TCAATTGGAC 3360

TGCATCAAAG AGATAATGAT CGATTAATTA ATACACTTAA AGAAATGAGA GATTTAGGAA 3420

ATACTTTAAT TGTAGTTGAA CACGATGATG ATACAATGCG TGCGGCTGAT TACTTAGTGG 3480
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AGGTAATGAA AGATAAAAAA TCATTAACAG GACAATACTT GAGTGGTAAG AAACGTATTG 3600 

AAGTACCTGA ATATCGCAGA CCGGCTTCAG ATCGTAAAAT TTCTATACGT GGAGCTAGAA 3660 

GCAACAATCT TAAAGGGGTT GATGTGGACA TACCACTATC AATCATGACG GTTGTTACAG 3720

GTGTATCAGG TTCTGGTAAA AGCTCATTAG TAAATGAAGT ATTATACAAA TCATTAGCTC 3780

AAAAAATTAA TAAATCTAAA GTAAAGCCAG GATTGTACGA TAAGATTGAA GGTATTGATC 3840

AACTTGATAA AATTATTGAT ATTGATCAAT CACCAATAGG TAGAACGCCA CGCTCTAATC 3900

CAGCAACATA TACTGGTGTG TTTGATGATA TACGTGATGT GTTTGCGCAA ACAAATGAAG 3960

CTAAAATTCG AGGATATCAA AAAGGGCGTT TTAGTTTTAA TGTAAAAGGT GGACGCTGTG 4020

AAgcTTGTAA AGGTGACGGT ATTATTAAAA TTGAAATGCA TTTTTTACCT GATGTTTATG 4080

TTCCTTGTGA AGTGTGTGAT GGTAAACGAT ATAATCGTGA GACACTAGAG GTTACTTACA 4140

AAGGTAAAAA TATTGCTGAC ATTTTAGAAA TGACTGTTGA AGAAGCAACA CAATTTTTTG 4200

AAAATATTCC TAAGATTAAG CGCAAGTTAC AAACACTAGT TGATGTTGGT CTTGGATACG 4260

TCACATTAGG TCAACAAGCT ACAACGTTAT CAGGTGGTGA GGCTCAACGT GTGAaACTTG 4320

CATCTGAACT TCATAAACGT TCAACTGGTA AATCTATTTA TATCCTAGAT GAACCGACAA 4380

CAGGGTTACA TGTTGACGAT ATTAGTAGAT TATTAAAAGT ATTAAACCGA TTAGTTGAAA 4440

ATGGTGATAC TGTTGTAATT ATTGAACATA ACCTAGATGT TATCAAAACA GCAGACTATA 4500

TTATAGACTT AGGTCCTGAA GGTGGTAGTG GCGGTGGTAC TATTGTTGCG ACTGGCACAC 4560

CCGAAGATAT TGCTCAGACA AAGTCATCAT ATACAGGAAA GTATTTAAAA GAAGTACTTG 4620

AACGAGATAA ACAAAATACT GAAGATAAAT AAGATTAAAA GAAGTGAAGG ATGTTATAAA 4680

TTTATCCTTC GCTTCTTTTT ATTAATTTAG TAATGAATAG TAGAAAGAAA AGATGCGTAA 4740

AAAGAATTAT GTTAAGATAG GGTCAATCTA GAGTAGTTAA ACATAAATCG AACTGGGAGT 4800

GGGACAGAAA TGATAAAGAA TCACTAATGA TTTATTATGT AGTGGTTCTT TGTCATTAGC 4860

CACAGCTATT GTGTACTTAA AAATAGGaat GCaTgAGTGC AACTCATGCA TAAGaAATAC 4920

TAATTTCTAA AGAAAAAGTA TTTCTTTATG TTGGGGCCCC GCCAACTTGC ATTGTTTGTA 4980

GAATTTCTTT TCGAAATTCT TTATGTTGGG GCCCCGCCAA CTTGCATTGT TTGTAGAATT 5040

TCTTTTCGAA ATTCTTTATG TTGGGGCCCC GCCAACTAAT TCCAATATAT CATTGTAGAG 5100

CTTAGGTCAT TGATTTTTGG CTCGGACTTT TATGGCGATA TGAACCATGT AAATTAAGCA 5160

AGCAATAAAT TAATGATTGA TATTGACTTG TAAAATAATA ACAATAATGA ACAATTAATA 5220

TTTATTTTAG CTTTTCAATG TAGATTGGTG TTATATTTTT GATATGATAA GAAGAGATGT 5280 
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ACATTAAAGT 


TAGATTTAAT CGCTGGTGAA GAAGGACTAT CGAAGCCAAT TAAAAATGCT 5400

GATATATCAA GACCGGGCTT AGAGATGGCA GGTTATTTTT CACATTATGC GTCAGATAGA 5460

ATACAAC-T. AT TAGGAACAAC GGAACTATCG TTTTACAATT TATTACCAGA TAAGGATCGC 5520

GCAGGTCGTA TGCGTAAACT ATGCAGACCA GAAACGCCTG CAATTATTGT GACACGTGGA 5580

ITGCAbC cac CAGAAGAATT AGTTGAAGCT GCAAAAGAAT TAAATACCCC ACTTATAGTT 5640

GCTAAAGATG CGACTACAAG TTTAATGAGT CGCTTAACAA CGTTTTTAGA GCATGCACTT 5700

G LAAAbAC GA CATCTTTACA TGGTGTTTTA GTAGATGTTT ACGGTGTTGG TGTACTAATT 5760

ACCGGTGATT CAGGAATAGG TAAAAGTGAG ACTGCGTTGG AATTAGTTAA ACGTGGGCAT 5820

AGATTAGTAG CAGATGATAA TGTAGAAATA CGTCAAATTA ATAAAGATGA ACTAATAGGG 5880

AAACCACCAA AGTTAATAGA ACATCTATTA GAAATACGTG GACTAGGTAT TATCAATGTT 5940

ATGACTTTAT TTGGCGCGGG TTCAATATTA ACTGAAAAAC GAATTAGATT AAATATTAAT 6000

TTGGAAAACT GGAACAAGCA AAAGTTATAT GACCGCGTAG GTCTTAATGA AGAGACGCTA 6060

AGTATTTTAG ATACTGAAAT CACTAAAAAA ACAATACCTG TAAGACCTGG TAGAAATGTT 6120

GCGGTAATTA TTGAGGTCGC TGCAATGAAC TATCGATTAA ATATCATGGG CATTAACACG 6180

GCCGAAGAAT TTAGTGAAAG ATTAAATGAA GAAATTATCA AGAACAGTCA TAAGAGTGAG 6240

GAGTAGGTTG AATGGGTATT GTATTTAACT ATATAGATCC TGTGGCATTT AACTTAGGAC 6300

CACTGAGTGT ACGATGGTAT GGAATTATCA TTGCTGTCGG AATATTACTT GGTTACTTTG 6360

T^TgCACAACG TGCACTAGTT AAAGCAGGAT TACATAAAGA TACTTTAGTA GATATTATTT 6420

TTTATAGTGC ACTATTTGGA TTTATCGCGG CACGAATCTA TTTTGTGATT TTCCAATGGC 6480

GGAAAATCCAr AGTGAAATTA TTAAAATATG GCATGGTGGA CATATTAGGC ATAGCAATAC 6540

ATGGTGGTTT AATAGGTGGC TTTATTGCTG GTGTTATTGT ATGTAAAGTG AAAAATTTAA 6600

ACCCATTTCA AATTGGTGAT ATCGTTGCGC CAAGTATAAT TTTAGCGCAA GGAATTGGAC 6660

GCTGGGGTAA CTTTATGAAT CACGAGGCAC ATGGTGGATC GGTGTCACGC GCTTTTTTAG 6720

AACAATTACA TTTGCCTAAT TTTATAATAG AAAATATGTA TATTAACGGC CAATATTATC 6780

ATCCAACATT CTTATATGAA TCCATTTGGG ATGTCGCTGG ATTTATTATC TTAGTTAATA 6840

TTCGTAAACA TTTAAAATTA GGAGAAACAT TCTTTTTATA TTTAACTTGG TATTCAATTG 6900

GTCGATTCTT TATAGAAGGA TTACGTACAG ATAGCTTAAT GCTCACAAGT AATATTAGAG 6960

TTGCACAATT AGTATCAATT CTTTTAATTT TAATAAGTAT AAGTTTAATT GTATATAGAA 7020

GGATTAAGTA TAATCCACCG TTGTATAGCA AAGTTGGGGC GCTTCCATGG CCAACAAAAA 7080
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TTATGGCGTG TATACCGTCT TGTTAAATTT 
TCGAAAGTTT TTAAGAATGT AATTATCATT 7200

GAATTTTCGA AATTTATTCC AAGTATGGTA CTGAAAAGAC ATATATATAA ACAACTTTTA 7260

AATATTAATA TCGGTAATCA ATCGTCGATA GCTTATAAAG TAATGTTAGA TATTTTTTAC 7320

CCAGAACTGA TTACGATTGG TAGTAACAGT GTTATTGGTT ACAATGTAAC AATTTTGACG 7380

CATGAAGCAT TAGTTGATGA ATTTCGTTAT GGACCAGTGA CGATAGGATC TAACACTTTG 7440

ATTGGTGCAA ATGCTACCAT TTTACCCGGT ATAACGATTG GTGACAATGT AAAAGTTGCA 7500

GCTGGTACGG TTGTTTCAAA AGATATACCG GATAATGGAT TTGCATATGG CAACCCTATG 7560

TATATAAAAA TGATTAGGAG GTGACAATTT TATGGCGCAA AAGAATAATA ATGTAATTCC 7620

AATGACTTTT GATGATGCAT TTTATCGTAA AATGGCTAAA CAGAAGTTTA AACAAAGAGA 7680

ATATAAACGA GCTGCTGAAT ACTTTGAAAA AGTGTTAGAA TTGTCACCTG ATGATCTGGA 7740

AATTCAAATT GATTATGCAC AATGTCTAGT GCAACTTGGT ATTGCTAAAA AAGCAGAACA 7800

TTTATTTTAT GACAATATTA TTTATAATAG GCATCTAGAA GATAGCTTTT ATGAATTGAG 7860

TCAGCTCAAC ATTGAAGTTA ACGAACCAAA CAAGGCATTC TTGTTTGGTA TTAATTATGT 7920

TATTGTTAGC GACGACCAAG ATTATAGAGA TGAATTAGAT CAAATGTTTG ATGTGAAATA 7980

TCAAAGTGAA GAACAAATTG AACTTGAAGC TCAATTGTTT GTAGTTCAAA TACTATTCCA 8040

ATATCTTTTT TCTCAAGGTC GATTAAAAGA TGCAAAGAAT TATGTCTTAC ATCAACCACA 8100

AGAAGTTCAA GATCATCGTG TAGTACGTAA TTTATTGGCA ATGTGTTATT TATATCTCGG 8160

TGAATATGAT ACgGCTAAAG CATTGTACGA aGCACtATTA CAAGAGGATA GTACaGATAT 8220

ATATGCATTA TGCCATTATA CTTTGCTACT TTATAACACT AAGGAAAATG AACAATATCA 8280

AAAATATTTA AAAATATTAA ACAAAGTTGT ACCTATGAAT GACGATGAAA GTTTTAAATT 8340

AGGTATTGTA TTAAGTTATT TAAAGCAGTA TCGTGCATCA CAACAATTGT TGTACCCTTT 8400

ATATAAAAAA GGGAAATTTT TATCAATTCA AATGTACAAT GCTTTAGCAT ATAATTATTA 8460

TTATTTAGGT GAAGAAGACG AAAGTCATTA CTACTGGGAT AAATTGAAGC AAATTTCTAA 8520

AGTGGAAATT GGACATGCGC CTTGGGTAAT TGAAAATAGC AAAGAAGTTT TTGACCAACA 8580

TATTTTGCCA TTACTTCAAA GTGATGACAG TCATTATCGT TTATATGGTA TTTTTTTATT 8640

GGATCAATTA AATGGTAAAG AAATTGTGAT GACGGAAAGT ATTTGGCAGG TTTTGGAAAA 8700

TCTAAATAAT TATGAGAAAT TGTATTTAAC GTATTTAGTT CAAGGTTTAA CGCTCAATAA 8760

ATTAGACTTC ATTCATCGCG GCTTATTAAC GCTTTACCAT AATGAATTAT TTGTAAGTGA 8820

AAATGATGTA ATGGTTGCAT GGATTAATCA AGGTGAACTC ATAATTGCTG AAAAAGTAGA 8880
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TCGAAAOGTT 
ACAAAGAAGC AAATTACAAC ATGGTTAGGC ATAACACAAT ATAAACTGAA 9000

CAAAATGATT GAATTTCTCT TGAGCATATA GATTTATGAA AAGTTAGATT TATTATATAA 9060

TGCGCATAAT GATTAATAAT GAGGAGGCGT TAATAAAATG ACTGAAATAG ATTTTGATAT 9120

AGCAATTATC GGTGCAGGTC CAGCTGGTAT GACTGCTGCA GTATACGCAT CACGTGCTAA 9180

TTTAAAAACA GTTATGATTG AAAGAGGTAT TCCAGGCGGT CAAATGGCTA ATACAGAAGA 9240

AGTAGAGAAC TTCCCTGGTT TCGAAATGAT TACAGGTCCA GATTTATCTA CAAAAATGTT 9300

TGAACACGCT AAAAAGTTTG GTGCAGTTTA TCAATATGGA GATATTAAAT CTGTAGAAGA 9360

TAAAGGCGAA TATAAAGTGA TTAACTTTGG TAATAAAGAA TTAACAGCGA AAGCGGTTAT 9420

TATTGCTACA GGTGCAGAAT ACAAGAAAAT TGGTGTTCCG GGTGAACAAG AACTTGGTGG 9480

ACGCGGTGTA AGTTATTGTG CAGTATGTGA TGGTGCATTC TTTAAAAATA AACGCCTATT 9540

CGTTATCGGT GGTGGTGATT CAGCAGTAGA AGAGGGAACA TTCTTAACTA AATTTGCTGA 9600

CAAAGTAACA ATCGTTCACC GTCGTGATGA GTTACGTGCA CAGCGTATTT TACAAGATAG 9660

AGCATTCAAA AATGATAAAA TCGACTTTAT TTGGAGTCAT ACTTTGAAAT CAATTAATGA 9720

AAAAGACGGC AAAGTGGGTT CTGTGACATT AACGTCTACA AAAGATGGTT CAGAAGAAAC 9780

ACACGAGGCT GATGGTGTAT TCATCTATAT TGGTATGAAA CCATTAACAG CGCCATTTAA 9840

AGACTTAGGT ATTACAAATG ATGTTGGTTA TATTGTAACA AAAGATGATA TGACAACATC 9900

AGTACCAGGT ATTTTTGCAG CAGGAGATGT TCGCGACAAA GGTTTACGCC AAATTGTCAC 9960

TGCTACTGGC GATGGTAGTA TTGCAGCGCA AAGTGCAGCG GAATATATTG AACATTTAAA 10020

CGATCAAGCT TAATTCGAAG TCGAATTAAG ATGTTGAGCT GTAAATTATT TGGATATTTA 10080

TTTTAATAGT GTCATCACAG CGTTAAAATAT ATGTCTTACT TTTAAATTAAT AGCAAATTAT 10140

ATAGSAAACT AGAACTTAGT ACGTATCATT TGTGCGTTTC AATGAGTTCT AGTTTTTTTA 10200

TATGTTATAT TAAACTTATA ACTTTATGGG AGTGGGACAG AAATGATAAA GAGCCACTAA 10260

TGATTTATTA TGTAGTGGTT CTTAAACATT AGCCACAGCT AATGTGTACT TAAAAATAGG 10320

AATACATGAG TAAAACTCAT GCATAAGAAA TACTAATTTC TATAGAAAAA GTATTACTTT 10380

ATCGTTGTCC CACCCCAACT TGCACATTAT TGTAAGCTGA CTTTCCGCCA GCTTCTGTGT 10440

TGGGGCCCCG CCAACTTGCA CATTATTGTA AGCTGACTTT TCGTCAgCTT CTGTGTTGGG 10500

GCCCCGCCAA CTTGCACATT ATTGTAAGCT GACTTTTCGT CAGCTTCTGT GTTGGGGCCC 10560

CGCCAACTTG CATTGTCTGT AGAAATTGGG AATCCAATTT CTCTATGTTG GGGCCCACAC 10620

CCCAACTCGC ATTGCCTGTA GAATTTCTTT TCGAAATTCT CTGTGTTGGG GCCCACACCC 10680
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ACTCGCATTG CCTGTAGAAT TTCTTTTCGA AATTCTCTGT GTTGGGGCCC CTGACTAGAG 10800 

TTGAAAAAAG CTTGTTGCAA GCGCATTTTC ATTCAGTCAA CTACTAGCAA TATAATATTA 10860

TAGACCCTAG GACATTGATT TATGTCCCAA GCTCCTTTTA AATGATGTAT ATTTTTAGAA 10920 

ATTTAATCTA GACATAGTTG GAAATAAATA TAAAACATCG TTGCTTAATT TTGTCATAGA 10980 

ACATTTAAAT TAACATCATG AAATTCGTTT TGGCGGTGAA AAAATAATGG ATAATAATGA 11040 

AAAAGAAAAA AGTAAAAGTG AACTATTAGT TGTAACAGGT TTATCTGGCG CAGGTAAATC 11100 

TTTGGTTATT CAATGTTTAG AAGACATGGG ATATTTTTGT GTAGATAATC TACCACCAGT 11160 

GTTATTGCCT AAATTTGTAG AGTTGATGGA ACAAGGAAAT CCATCCTTAA GAAAAGTGGC 11220 

AATTGCAATT GATTTAAGAG GTAAGGAACT ATTTAATTCA TTAGTTGCAG TAGTGGATAA 11280 

AGTCAAAAGT GAAAGTGACG TCATCATTGA TGTTATGTTT TTAGAAGCAA GTACTGAAAA 11340

ATTAATTTCA AGATATAAGG AAACGCGTCG TGCACATCCT TTGATGGAAC AAGGTAAAAG 11400

ATCGTTAATC AATGCAATTA ATGATGAGCG AGAGCATTTG TCTCAAATTA GAAGTATAGC 11460

TAATTTTGTT ATAGATACTA CAAAGTTATC ACCTAAAGAA TTAAAAGAAC GCATTCGTCG 11520

ATACTATGAA GATGAAGAGT TTGAAACTTT TACAATTAAT GTCACAAGTT TCGGTTTTAA 11580

ACATGGGATT CAGATGGATG CAGATTTAGT ATTTGATGTA CGATTTTTAC CAAATCCATA 11640

TTATGTAGTA GATTTAAGAG CTTTAACAGG ATTAGATAAA GACGTTTATA ATTATGTTAT 11700 

GAAATGGAAA GAGACGGAGA TTTTCTTTGA AAAATTAACT GATTTGTTAG ATTTTATGAT 11760 

ACCCGGGTAT AAAAAAGAAG GGAAATCTCA ATTAGTAATT GCCATCGGTT GTACGGGTGG 11820 

ACAACATCGA TCTGTAGCAT TAGCAGAACG ACTAGGTAAT TATCTAAATG AAGTATTTGA 11880

ATATAATGTT TATGTGCATC ATAGGGACGC ACATATTGAA AGTGGCGAGA AAAAATGAGA 11940 

CAAATAAAAG TTGTACTTAT CGGTGGTGGC ACTGGCTTAT CAGTTATGGC TAGGGGATTA 12000 

AGAGAATTCC CAATTGATAT TACGGCGATT GTAACAGTTG CTGATAATGG TGGGAGTACA 12060

GGGAAAATCa GAGATGAAAT GGATATACCA GCACCAGGAG ACATCAGAAA TGTGATTGCA 12120

GCTTTAAGTG ATTCTGAGTC AGTTTTAAGC CAACTTTTTC AGTATCGCTT TGAAGAAAAT 12180

CAAATTAGCG GTCACTCATT AGGTAATTTA TTAATCGCAG GTATGACTAA TATTACGAAT 12240

GATTTCGGAC ATGCCATTAA AGCATTAAGT AAAATTTTAA ATATTAAAGG TAGAGTCATT 12300

CCATCTACAA ATACAAGTGT GCAATTAAAT GCTGTTATGG AAGATGGAGA AATTGTTTTT 12360

GGAGAAACAA ATATTCCTAA AAAACATAAA AAAATTGATC GTGTGTTTTT AGAACCTAAC 12420

GATGTGCAAC CAATGGAAGA AGCAATCGAT GCTTTAAGGG AAGCAGATTT AATCGTTCTT 12480 
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GCGTTAATTC ATTCTGATGC GCCTAAGCTA TATGTTTCTA ATGTGATGAC GCAACCTGGG 126 00 

GAAACAGATG GTTATAGCGT GAAAGATyAT ATCGATGCGA TTCATAGACA AGCTGGACAA 12660

CCGTTTATTG ATTATGTCAT TTGTAGTACA CAAACTTTCA ATGCTCAAGT TTTGAAAAAA 12720

TATGAAGAAA AACATTCTAA ACCAGTTGAA GTTAATAAGG CTGAACTTGA AAAAGAAAGC 12780

ATAAATGTAA AAACATCTTC AAATTTAGTT GAAATTTCTG AAAATCATTT AGTAAGACAT 12840

AATACTAAAG TGTTATCGAC AATGATTTAT GACATAGCTT TAGAATTAAT TAGTACTATT 12900

CCTTTCGTAC CAAGTGATAA ACGTnAATAA TATAGAACGT AATCATATTA TGATATGATA 12960

ATAGAGCTGT GAAAAAAATG AAnATAGACA GTGGTTCTAA GGTGAATCAT GTTTTAAATA 13020

AGAAAGGAAT GACTGTACGA TGAGCTTTGC ATCAGAAATG AAAAATGAAT TAACTAGAAT 13080 
AGACGT 13086

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1350 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CATTAGTCAT GAAAATAGCC GACAACTTCA TCTGTGAAAT CACCGGCCTT TTATTTTAGC 60

TAACTTTATT TCTGATTTTA CGATTTTAAT TGATCATACA GAGAAAGTGA TCTTTTTACA 120 
ATTTCTAAAA ACTCATGATC TATATTGGAC ATTTGATGAA AATAAGACAA AATGTTTTCT 180 
-GTTAGCTTCT-CTTGTTTTCG-GAATGAATC^^ 240" 



AATGGTGTTT TATCATCTTT AAATGTTTGT ATATATTCGT AAAAGCTCAT AGTATTCCTT 300 

CTCTCAATTT ACTTATATAA ATCCTACCAC GAAAGCTTTC AAGAAAACAC AATTAAATGT 360 

CTATTTAGTG AACTTTTTAA GGTTGTGCAC TCTTTTAATG TCTGCCAATT AGGTCAATTA 420

ATCATCACAA TGTACAATTA ACTCTATTTT CAGTTCATAT ACTCACACAC CGTTTTTGAA 480

CAACACATTA ACTTCTCATT TAGATAAAAC GCAAAAAAGC CTGGCACCAA TACAATAGAT 540

GCCAGACTAA GAGTCTACTA TATAAATTTA TTTAGCGTAT GGTTTTACTT CGATTGCACC 600

TTCATTTTCA TCATGAACAC CATGCTTATA ATAATCAATA TATTGTGGCT CTAAAGGCTT 660

TCTGCCACGT ATAATGTCTG CTGCTTTTTC AGCTAACATT AAAACAGGTG CGTGTATATT 720

GCCATTTGTC GTACGTGGCA TAGCTGATGC ATCAACTACA CGTAAATTTT CCATACCGTG 780
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ACTACAAGAT GGGTGTAATG CTGTTTCACC ATCTCTACGA ACCCAATCAA GAATTTCTTC 900 

GTCTGTTTGC ACTTCTGGTC CTGGTGAAAT TTCTCCACCA TTGAATGGAT CCATTGCTTT 960 

TTGAGATAAG ATATTTCTTG CTACACGAAT TGCTTCTACC CATTCTTTTT TATCTTCTTC 1020

TGTTGATAAA TAATTAAAGC GGATACTTGG TTTTTCGAAT GGATCTTTAG ATTTGATTTT 1080

CAAGCTACCA CGAGAGTTTG AATACATTGG TCCTACGTGA ACTTGATAAC CATGTGCGAC 1140

CGCTGCCTTT TGACCATCAT ATCTTACAGC TATTGGTAAG AAATGGAACA TTAAGTTAGG 1200

ATAAtCAACT TCGTTATTTG AACGTACAAA TCCGCCACCT TCAAAATGGT TAGATGCTGC 1260

TGCACCTGTA CGTGTGAAAA TCCATTGTAA ACCAATAAAT GGcATGCGCT TGAtATCTAA 1320

GCTTGGCtGt AATGATACAG GTTCCTTACA 1350

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1376 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TAATGCTATT GGCAACACCA TATATGAAAn CTCCAAACGA TCCTAAACCG ACTATAGATT 60

CACCAAATTT nACAATCCAT GAATAAAGTA GTGGCCATAA GAATAACAAT ATGACAACTA 120

AAAATGTACA GTAAAATGCA GTCATAATTG GAACTAGACG TTTACCACTA AAAAATGATA 180

ATGCTAATGG TAATTCTGTT TCACTAAACT TATTGTATGC ATAAGCTGCT ATTAAACCTA 240

TTACAATACC AACAAAGACA TTGCCATTAT TCATCTTTTC AAAAGCTGAA TTTATTTCCG 300

ArGCTTTCAT TCCTAATAAA GGCGCTAATT TCATTGGTGA TAATACAACT GTAACTAAAA 360

AATATCCTAA CGTrGCTGCA rGCGsGACTG CACCATCATT TTTCTTTGCC ATTCCTATAG 420

CTACACCAAT TGCAAATAAA ATACCTAATT GCTCTAAAAT CGTAGTACCT ACCGTAGTAA 480

AGAACATTGC GATTTTCGGC GTCGCATGAA GTGCATTTAA CGTATTACCA ATTCCGGCAA 540

TAATTGCTGC AGCCGGTAAA ATGGCAACTG GTAACATTAA CGAACGCCCT AAATTTTGGA 600

AAAATTTATA CATTGAATGT CATCCTTCTT AAAATAATGT AGAAATATAA AGATTACTAA 660

TGTAACTAGA ATAACTACTT CGATACTCCG TTATAGTCAC CTAGGCTTAC TAACCAGCTA 720

TATTTCTACC TCAAGTTATT TTATAAACTT TTTACAATTT CATGCAATTC TTGTTGTAAC 780

TTTGCTGTTC GTGTTTCAAT CTCTTTTGTA ATATAATCGA TACGCTCGTT TCGTTTTAAA 840
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AAAGACCGTG AATCTTAGTA GGACCAACAT AAGCAACAGG TAATATTGGT 
GACTTACTTA 960

ACATTGCAAT TGTTGAAGCA CCaCGTTTCA AAGGTGCACC TTCTTGCGAT GTGCGAGAAC 1020

CTGTTGGGAA GATACCAACT GTCTTATTAT CTTTCAACAA ATTGATTGGG CGTTTTAAAG 1080

TACTAGGTCC TGGATTTTCA CGATCTACAG GAAATGCATT TAAAGACGTT AAAAATTTAC 1140

CAATCCATTT ATTTTTGAAT AATTCTTTTT TAGCCATATA ATGAATTTGA TTAGGATATA 1200

ATGCCATACC TAGCATAATG ACTTCGTTAT AACTTTCATG CGTACAAGTT ACGACATATT 1260

TACTATCCTT AGGAATATTA TCTTTACCGA TTACGTATAA TGATTTTGAC ATTTTAACTA 1320

AAATGAAATT CAAAATCTTA CTAATCACTG AATACATTGT GCCACCTACT TAACTT 1376

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7363 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TTGTCATACC AATATTTTGT AAAATATGGA ACACAAGTAA AGTGACGAAA CCAACGATAA 60

AGATTTTGTT AAATTGATCT TCAATTTTCG CAGCTAATCT TATTAGATGG AAGATTAAAA 120

ATAAAAATAT TAAGATCAAT ATGACAGAAC CGATAAAGCC AAGTTCCTCT CCAATCACTG 180

AAAAGATAAA GTCAGTATGA TTTTCAGGTA TATAAACTTC ACCGTGATTG TATCCTTTAC 240

CTAGTAACTG TCCAGAACCG ATAGCTTTAA GTGATTCAGT TAAATGaTAG CCATCACCAC 300

TACTATATGT ATAGGGGTCA AGCCATGAAT TGATTCGTCC CA 360

CTAAJAAATT TTCAATTAAT GCGGGTGCAT ATAGaATACC TAAAATGACT GTCATTGCAC 420

CAACaATACC TGTAATAAAG ATAGGTGCTA AGATACGCCA TGTTATACCA CTTACTAACA 480

TCACACCTGC AATAATAGCA GCTAATACTA ATGTAGTTCC TAGGTCATTT TGCAGTAATA 540

TTAAAATACT TGGTACTAAC GAGACACCAA TAATTTTGAA AAATAATAAC AAATCACTTT 600

GGAATGATTT ATTGAATGTG AATTGATTAT GTCTAGAAAC GACACGCGCT AATGCTAAAA 660

TTAAAATAAT TTTCATGAAT TCAGATGGCT GAATACTGAT AGGGCCAAAC GTGTACCAAC 720

TTTTGGCACC ATTGATAATA GGTGTAATAG GTGACTCAGG AATAACGAGC AAGCCTATTA 780

ATAATAGACA GATTAAGAAA TACAATAAAT ATGTATAATG TTTAATCTTT TTAGGTGAAA 840

TAAACATGAT GATACCTGCA AAAATTGCAC CTAAAATGTA ATAAAAAATT TGTCTGATAC 900
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TTGCTAAAAC AGCTATAGTG GCTACTAATA CCCAGTCTAC TTTGCGAAnC aATGCTTATC 1020 

CGGCTGTTGA CGAGATGAAT AATTCATTGC AAACTCCTTT TATACTCACT AATGTTTATA 1080

TCAATTTTAC ATGACTTTTT AAAAATTAGC TAGAATATCA CAGTGATATC AGCTATAGAT 1140

TTCAATTTGA ATTAGGAATA AAATAGAAGG GAATATTGTT CTGATTATAA ATGAATCAAC 1200

ATAGATACAG ACACATAAGT CCTCGTTTTT AAAATGCAAA ATAGCATTAA AATGTGATAC 1260

TATTAAGATT CAAAGATGCG AATAAATCAA TTAACAATAG GACyAAATCA ATATTAATTT 1320

ATATTAAGGT AGCAAACCCT GATATATCAT TGGAGGAAAA CGAAATGACA AAAGAAAATA 1380

TTTGTATCGT TTTTGGAGGG AAAAGTGCAG AACACGAAGT ATCGATTCTG ACAGCACAAA 1440

ATGTATTAAA TGCAATAGAT AAAGACAAAT ATCATGTTGA TATCATTTAT ATTACCAATG 1500

ATGGTGATTG GAGAAAGCAA AATAATATTA CAGCTGAAAT TAAATCTACT GATGAGCTTC 1560

ATTTAGAAAA TGGAGAGGCG CTTGAGATTT CACAGCTATT GAAAGAAAGT AGTTCAGGAC 1620

AACCATACGA TGCAGTATTC CCATTATTAC ATGGTCCTAA TGGTGAAGAT GGCACGATTC 1680

AAGGGCTTTT TGAAGTTTTG GATGTACCAT ATGTAGGAAA TGGTGTATTG TCAGCTGCAA 1740

GTTCTATGGA CAAACTTGTA ATGAAACAAT TATTTGAACA TCGAGGGTTA CCACAGTTAC 1800

CTTATATTAG TTTCTTACGT TCTGAATATG AAAAATATGA ACATAACATT TTAAAATTAG 1860

TAAATGATAA ATTAAATTAC CCAGTCTTTG TTAAACCTGC TAACTTAGGG TCAAGTGTAG 1920

GTATCAGTAA ATGTAATAAT GAAGCGGAAC TTAAAGAAGG TATTAAAGAA GCATTCCAAT 1980

TTGACCGTAA GCTTGTTATA GAACAAGGCG TTAACGCACX3 TGAAATTGAA GTAGCAGTTT 2040

TAGGAAATGA CTATCCTGAA GCGACATGGC CAGGTGAAGT CGTAAAAGAT GTCGCGTTTT 2100

ACGATTACAA ATCAAAATAT AAAGATGGTA AGGTTCAATT ACAAATTCCA GCTGACTTAG 2160

ACGAAGATGT TCAATTAACG CTTAGAAATA TGGCATTAGA GGCATTCAAA GCGACAGATT 2220

GTTCTGGTTT AGTCCGTGCT GATTTCTTTG TAACAGAAGA CAACCAAATA TATATTAATG 2280

AAACAAATGC AATGCCTGGA TTTACGGCTT TCAGTATGTA TCCAAAGTTA TGGGAAAATA 2340

TGGGCTTATC TTATCCAGAA TTGATTACAA AACTTATCGA GCTTGCTAAA GAACGTCACC 2400

AGGATAAACA GAAAAATAAA TACAAAATTG ACTAACTGAG GTTGTTATTA TGATTAATGT 2460

TACATTAAAG CAAATTCAAT CATGGATTCC TTGTGAAATT GAAGATCAAT TTTTAAATCA 2520

AGAGATAAAT GGAGTCACAA TTGATTCACG AGCAATTTCT AAAAATATGT TATTTATACC 2580

ATTTAAAGGT GAAAATGTTG ACGGTCATCG CTTTGTCTCT AAAGCATTAC AAGATGGTGC 2640

TGGGGCTGCT TTTTATCAAA GAGGGACACC TATAGATGAA AATGTAAGCG GGCCTATTAT 2700
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AAACCCTAAA GTAATTGCCG TCACAGGGTC TAATGGTAAA ACAACGACTA AAGATATGAT 2820 

TGAAAGTGTA TTGCATACCG AATTTAAAGT TAAGAAAACG CAAGGTAATT ACAATAATGA 2880

AATTGGTTTA CCTTTAACTA TTTTGGAATT AGATAATGAT ACTGAAATAT CAATATTGGA 2940

GATGGGGATG TCAGGTTTCC ATGAAATTGA ATTTCTGTCA AACCTCGCTC AACCAGATAT 3000

TGCAGTTATA ACTAATATTG GTGAGTCACA TATGCAAGAT TTAGGTTCGC GCGAGGGGAT 3060

TGCTAAAGCT AAATCTGAAA TTACAATAGG TCTAAAAGAT AATGGTACGT TTATATATGA 3120

TGGCGATGAA CCATTATTGA AACCACATGT TAAAGAAGTT GAAAATGCAA AATGTATTAG 3180

TATTGGTGTT GCTACTGATA ATGCATTAGT TTGTTCTGTT GATGATAGAG ATACTACAGG 3240

TATTTCATTT ACGATTAATA ATAAAGAACA TTACGATCTG CCAATATTAG GAAAGCATAA 3300

TATGAAAAAT GCGACGATTG CCATTGCGGT TGGTCATGAA TTAGGTTTGA CATATAACAC 3360

AATCTATCAA AATTTAAAAA ATGTCAGCTT AACTGGTATG CGTATGGAAC AACATACATT 3420

AGAAAATGAT ATTACTGTGA TAAATGATGC CTATAATGCA AGTCCTACAA GTATGAGAGC 3480

AGCTATTGAT ACACTGAGTA CTTTGACAGG GCGTCGCATT CTAATTTTAG GAGATGTTTT 3540

AGAATTAGGT GAAAATAGCA AAGAAATGCA TATCGGTGTA GGTAATTATT TAGAAGAAAA 3600

GCATATAGAT GTGTTGTATA CGTTTGGTAA TGAAGCGAAG TATATTTATG ATTCGGGCCA 3660

GCAACATGTC GAAAAAGCAC AACACTTCAA TTCTAAAGAC GATATGATAG AAGTTTTAAT 3720

AAACGATTTA AAAGCGCATG ACCGTGTATT AGTTAAAGGA TCACGTGGTA TGAAATTAGA 3780

AGAAGTGGTA AATGCTTTAA TTTCATAGAG ATTAGTCGAG GGACCTTTTA CTTATAAAAA 3840

TGATTTGAAT TAATACTAAA AGATTACAAA GAAGAGGTGG TTTTGTGTGT AAATACAAAA 3900

TTGCCTTTTT CTTTTTATGT T 3960

GTACACACTT TATATAGGAA GTAGTTTGAA TGTTTATATA ATGTTTTACA AAAAGATGTA 4020

GTATTATAAT GTCTAATTTC ACATGTGTTT CAGTAAAATT TGTTGTGGAA TGTTAACGAT 4080

ATACGTATTT TATAAAAaAT TTTTTATAAT GATTATTCGA ATGATGCGTA ACGCTTACAT 4140

CTTATCTAAT GCTAGCTTTT TGACAAAAAT ATGACAATCA ATTAATGTGA TTCTAATAAA 4200

TATTCGCAAA TTGCTTTATT GCGATTAAAT TTTTTTGGTG GTACTATATA GAAGTTGATG 4260

AAATATTAAT GAACTTATAT GCAAAAGTAT ATTGAGAAAT AAACAGGTAA AAAGGAGAAT 4320

TATTTTGCAA AATTTTAAAG AACTAGGGAT TTCGGATAAT ACGGTTCAGT CACTTGAATC 4380

AATGGGATTT AAAGAGCCGA CACCTATCCA AAAAGACAGT ATCCCTTATG CGTTACAAGG 4440

AATTGATATC CTTGGGCAAG CTCAAACCGG TACAGGTAAA ACAGGAGCAT TCGGTATTCC 4500
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AG AATTGG CA ATGCAGGTAG CTGAACAATT AAGAGAATTT AGCCGTGGAC AAGGTGTCCA 462 0 

AGTTGTTACT GTATTCGGTG GTATGCCTAT CGAACGCCAA ATTAAAGCCT TGAAAAAAGG 4680 

CCCACAAATC GTAGTCGGAA CACCTGGGCG TGTTATCGAC CATTTAAATC GTCGCACATT 4740

AAAAACGGAC GGAATTCATA CTTTGATTTT AGATGAAGCT GATGAAATGA TGAATATGGG 4800

ATTCATCGAT GATATGAGAT TTATTATGGA TAAAATTCCA GCAGTACAAC GTCAAACAAT 4860

GTTGTTCTCA GCTACAATGC CTAAAGCAAT CCAAGCTTTA GTACAACAAT TTATGAAATC 4920

ACCAAAAATC ATTAAGACAA TGAATAATGA AATGTCTGAT CCACAAATCG AAGAATTCTA 4980

TACAATTGTT AAAGAATTAG AGAAATTTGA TACATTTACA AATTTCCTAG ATGTTCATCA 5040

ACCTGAATTA GCAATCGTAT TCGGACGTAC AAAACGTCGT GTTGATGAAT TAACAAGTGC 5100

TTTGATTTCT AAAGGATATA AAGCTGAAGG TTTACATGGT GATATTACAC AAGCGAAACg 5160

TTtAGAAGTA TTanAGAAAT TTAAAAATGA CCAAATTAAT ATTTTAGTCG CTACTGATGT 5220

AGCAGCaAGA GGACTAGATA TTTCTGGTGT GAGTCATGTT TATAACTTTG ATATACCTCA 5280

AGATACTGAA AGCTATACAC ACCGTATTGG TCGTACGGGT CGTGCTGGTA AAGAAGGTAT 5340

CGCTGTAACG TTTGTTAATC CAATCGAAAT GGATTATATC AGACAAATTG AAGATGCAAA 5400

CGGTAGAAAA ATGAGTGCAy TcGTCCACCA CATCGTAAAG AAGTACTTCA AGCACGTGAA 5460

GATGACATCA AAGAAAAAGT TGAAAACTGG ATGTCTAAAG AGTCAGAATC ACGCTTGAAA 5520

CGCATTTCTA CAGAGTTGTT AAATGAATAT AACGATGTTG ATTTAGTTGC TGCACTTTTA 5580

CAAGAGTTAG TAGAAGCAAA CGATGAAGTT GAAGTTCAAT TAACTTTTGA AAAACCATTA 5640

TCTCGCAAAG GCCGTAACGG TAAACCAAGT GGTTCTCGTA ACAGAAATAG TAAGCGTGGT 5700

AATCCTAAAT TTGACAGTAA GAGTAAACGT TCAAAAGGAT ACTCAAGTAA GAAGAAAAGT 5760

ACAAAAAAAT TCGACCGTAA AGAGAAGAGC AGCGGTGGAA GCAGACCTAT GAAAGGTCGC 5820

ACATTTGCTG ACCATCAAAA ATAATTTATA GATTAAGAGC TTAAAGATGT AATGTCTTGA 5880

GCTCTTTTTT GTTTTCAATA ATTGATTCTC TGTAGATATC aAAGTaCTAA CGTTTTAAAG 5940

GTTAAATATT TAATTGGATT GAGATCTGTA TGCGGTTATA TCaTTCTGTG TAAATATGGT 6000

TCTCCACCAA ATGTGGTGAG TATATAATTT AAAGAACTAT TTTTAAATTA AGAATAATCG 6060

AACATAAATA AACTTTATGA AATTTCAGTA TCATGTTCTT ATAAAAAACA ATAGGGCTTT 6120

TTGctGACGC TAGTGCGCGA TAAATAATAA GTTGAATATA AAAAAGATCA CTGCCAATCA 6180

TTCGTTTAAT GGCAGCGATC TTTTTTATTT AATTATTTCT CTTTCCACTG CAACATTTGA 6240

TAACCAATGC GTGGATGTGT TTTAATAATA TCTTTTGCGT CCTCATGACA TTGTGAAAGT 6300 
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CCATATATTC GTTTTAATAT CATCTCATAA GTGAGTACTT TTCCTTTATG ATTTGACAAT 6420 

AGTTCTAACA AGCTAAATTC ATTTGGCGTC AAATGTACCT CCTGATTATT AATAACAACA 6480

GATTTGGAGC CAAAGTCGAT GCTTAGCAAA CCGTTAGTAA ATACAATGTT AGTTTCTTGA 6540

TGTGACTTAG CGATTCTCTC GATGACTCGT ATTCGTGCCC GAAGCTCATC AACATTAAAA 6600

GGTTTAGTCA TATAGTCATT CGCACCGTTA TCTAAAGCTT GAATAATTGT TTGTTCTTCT 6660

TGTCTTGCAC TTATTACAAT GATAGGAATG TCAGTATGTT GCCTGATTTC TGAAATCAAA 6720

CATAATCCAT CTTTATCTGG TAAACCTAAA TCTAATAAAA TGACATCTGG TTTATCAATT 6780

TGAATTTTAA AGTGTGCTTG TGTGGCATTG TCGGCTGTAG TTACATTGTA ATAATCTAAA 6840

GTTAATGCAA CATCAAGTAA ATGTGTGATT GCGTGATCAT CTTCAATTAT CAATATTTTA 6900

GATTGCATTA TACGTCTCCT TCGTTAAAGT CTGTATATAT ATTGAAATAG AATATACTGC 6960

CGTGTGGTTG GTTCGGTTTA TATTGTAAGT TTGATTGATG TTTGTGTAGG ATAGTCTGTA 7020

CTAAATATAA GCCTAGTCCC ATGCTTTCTT TTTGGTTATC TTTAAAATAT TTATTTGATC 7080

CTGTGTAAAA AGGCTCGAAT ATCTTTTGTt GTTCTTCTAA ACTAATTCCA GGTCCTTCGT 7140

CTATAACGGC AAATTCGATT TGTTCATAGC TAGCATAACG AATAGATAAA TTGATTTTGG 7200

TGTCAGTAGA AGTGTGTTTA ACTGCATTTT CAATCAAATT GAAtAAAgCT TGTAAAATCA 7260

ACTTACTGTC AATGTGTATA AACtGTAAAT TTACTGAGGA TGATACAGTT ATACGCTTTT 7320

TTAAATGGCG ACGTTCTAAA ATACATATCG ATTTCTTATA CTA 7363

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10470 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:



TTAACAATCG ATAACCACAA TACTTCTATT GTAATTGTTT AACGATTTCn CGATTAAAAT 60 

CATCTAAATC GTCTGGTACT CGACTTGTTA CAATATTGTT GTCTACAcTa CTGACTCATC 120

AACTACATGT GCGCCTGCAT TTGATAAATC TTTGCGTACA TTTAATACTG CTGTTAACGT 180

ACGACCTTTT AAATCGTCTG TATCTATTAG TATTTGTGGC CCATGACAAA TGGCAAATGT 240

TGGTACATCA TTTTTAGTAA AGTATTTAGC AAATGTGCCA TATCGACCTT CTGTATCTCC 300

ACGTAAATGA TCTGGTGAAA ATCCTCCAGG AATTAATAAT GCATCATAAT CTTCTGGTTT 360 
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ATTTGCAGTA TCTCCAATCA CTACAGTATT AAAGCCTGCA TTCTCTAATG CCTCTTTAGG 4 80 

GCTTGAATAT TCTATATCTT CAAATTCGTT TGCTAGAATA ATTGCTACTT TTTTAGTCAT 540

TGAAAATCAC CTTTCTATAT ATCATTGATA TAATTACTAT AGACAAGTAA ATCAGTGATT 600

AAACATACAA GATATAAAAA ATATTAAGCG ACTGTCGCGA TATCTAACCC TAACACATCT 660

TATGTGGCAT TTACTTAGAT ACTAATTTAA CCTTTTCTTC AAGCTGATCT AACAATCCAA 720

TCCATTCATC TATATCTTCA ACACGTACTT CATCAGGATT TACATGATCG ATATCCTCAA 780

TAAACTTATT TAAACGCGCT TTTATCTGTT CGATTGTTTG CTGTTCATTC ATAAAAAGTT 840

AACTCCTTTT ATTTTGTTTT CTTTTTCATT ATTATCCTAA CAGAAATTGC GTTAAAGCGA 900

TATAATCTTA GCTATATTTA TGACATTCAA ATTATTTTGA CTTTTAAAAA TCCCCTTTTC 960

AATTAACTAA AATTAAGAGA TAATTTGTTA CGAGTGATAA TACGAaGJcGG TaTCATACCG 1020

ATATGAACCA AATAGAAAGA AGGAAGTTTA AGACGATGAA TAGCGTCAAA TTGAAGCAAC 1080

CTGTTAGCAT TTACAATGAT CCATGGGAAG TGAAATTTAT ATACATTTAA ATTTCATGAG 1140

ACAATAAACG TTGATTTAAT GCGTTTTTTT GCCTTTTTTA TTTTCCTTAT TTTTTCTGTT 1200

TTACAACAAA ATGGTATCAA AAATGGTATC ATTTGTAGTT ATTTTAGCTT CACATATTAA 1260

AACAACCACA CTCCTAAATT AATAGGTGGT GTGGTTTTGT TGGTTGTGTG GGGATAAAAA 1320

TAACCGCATC AGTTAAGATG CGGTTATCTA GCAAGGGCCA CGTATTTATA AATACGTTTA 1380

GAATCTCTTC GGCAACTTTG CTATAGACAG TCTATGCTGT TACTAAATTA TACCACCACA 1440

CAAACCTACT CCCATTCAGG AACACAGAGC TTTGTCGCTC GTCAGCAACG TCATATGAAT 1500

TCTCAGTTCA TGTTGTGGTG ACACTTTAAA CGGTCTGTGC CAGTAGCGAC CGAGTCATTT 1560

CAAGAATGAC CATTTCACAT TTATATTATA ACACTTGTCG TGCGTAACTG TATAGTTTTT 1620

CAGTTGTATT TAAAGTTAAG TTATCTACTT CGCGCTTTCC TTGCCTTAAT TGTGAAATTA 1680

CATATTGCGC TACGCCAGTT TGTTTGTGAA TTTGGTAACC TGTTATATCA CTTTTGATCA 1740

ATTCAATTAT TTTTAATTTA TAATCACTCA TATTATCTAC GTCCATTCTT TTTATCTAAA 1800

CAATAAAAAT GTGTCTTTCT CCCGATAAAT AATAACAATG GTAGGCTTAA TAAAAACAAT 1860

ATTAAATACA TTTGTTCTGT CATAATTGAA AACCTCCAAA TAATATTATA TTATATAAGT 1920

GTAAGGAGGA GCCATCAGGC TCCAAGCATA ATGTTAATCT TTGTTGTTTG GCTTTCGGTC 1980

TAGGTAGCCG AGATGCCaTT CTCTAAGTTG TTTTAACACT TCTGGAATTA TCAGTACTGC 2040

CAATACTTGA TGTTCTAGAA GTGTTTTTAT TATGTCTAGC ATGAGGCTTT TCACCTCCTT 2100

ACACATAATT TGTAAGTCAT CAACTAACCT ACAAATATAA TTATACTAAA CAAATGTTTA 2160 






GTTATCTACA TTTAAATCTT GAGAGAAATG TTAAAAAGTT CTAGTAAAAT AATAGCACAT 2280

TTTATCTTTA AATGTAAATA GAAAGCAGGT ATGTAACGCA CCTGCTTAAA TAGaCATGAC 2340

TATGTCATTC TAACTGATTT CTCCCCATAA GTCACCTAAT ATCTGATTAG GTGGGGCAGA 2400

ACCATTCCAT GTTCTAATAG GCAAGTAATA ACGTTGCCCC TCCCATGTAT ATCCTACCCA 2460

AACATGACCA TCTTGTAACA TCACTTCTGT ATAATCACAA TACCCACCAG GTTGGAACTG 2520

ATAACCCACT GGACAAGATA AGAATGGCCC CACTTTTCTT ACTGTGATTG GTTGATTGCC 2580

GTTTGTGAAT CTAGCACTTT CTTCCATGTA GTAAGTACCA TATTTATTAC GTTTCCATGC 2640

ACTTGCAACT GGTTTAACTG TATTACTTGA AGCGCTTGAC TCATTAGAGA CAGTGGCAAC 2700

CGGTATTTTA CCATCCATGT ACGCCCTAAT CTGCTTGATA AAGTAGTCTT TAAGTTGCAA 2760

CCGCTTGTCT TCTGGCAATA GACCGCGAGT TACTGGGTCA AAACCAGTGT GTAAAACCGA 2820

ACTTCTATGA GGGCATGATG TTGAAGTAAA TTCATTGTGC AATCTGATTG TATTTCTGTT 2880

TGCTGGTAAT CCCCATTTTT TCAACAATCT AGCGCATTCT TGGAAAGTTG CCTGTTCATT 2940

TTTTAAGAAT GTCGCGTTAT CTGCGCCCAT TGATTGACAT ACTTCAATAC CGTAATAATA 3000

TTTATTACCT ATTTGATTAG CGGTATGCCA ACCTACTTGT GATTCATCTA AGGCTTGCCA 3060

AACTGTGTTG CCTGATACGT AACTATGCGC AATGCCCGCT TCTAATCTTG ATAAAGGTGC 3120

ATTTACTAAT CCGTTACGAT ATGCTTCAGC AGTCGCCCCT TTGCTCCCTG CGTCGTTGTG 3180

TATAACTATA CCTTTAGGGT TACTACCACG CTTAGGTAGG TCATAACCTT TAACCACATC 3240

TTTGATGATT TTAAGTTCTA CTGCTTTAGG TTGTGGCTTA GCTGTTTCTT TTTTAGGTGC 3300

TTGTGTAGGA GATTGAACTG ATCGTGGCGC TGTCTCACTT TTAAAATTCG GACGGATAAA 3360

CCACATAGGG AAATCATAAG CATGTTGTCG TCTTGTAACT TTTTCCCAAC CCCAGCCGGG 3420

TTGTTCGATT CCGTCAGTCC AGCCACCGCC TAGCCAATTC TGCTCATATA CAATGATGTA 3480

ATCTAAAGTT GCTTCAATTA CCCATGCAAC GTGACCATAT CCAGCACCGT AGTTGCTACC 3540

GAATACCACC ATGTCGCCAG GTTGTGCTAA GAAGTCCGGT GTATTTTGGT ATACAGTAGC 3600

TAATCCGTCG AAGTTGTTAG CGAACGGAAT ATCTTTTGCA CCTAAACCTT TTAGAAGTAA 3660

TCCAAACAAA ACTTTCCAAC CAGCATTGGC ATAATCAAAG CATTGAAATC CATACCATAA 3720

GTCCACATTG AATTGTTTTC CCTCAGAAGT TTTCAACCAC TCTATAAACT CATTTTTAGT 3780

TAATTTTGCT TGCATTGTCG CCACCTCCAT GATGATACTC ATTCACATCA AAGCCAACAT 3840

CGTTAGAGGC GTCTGTGAAA GGTTGTGATG TATCATATTC TTTTGGTGcT TTCGCGCTTA 3900

ATTCCGGCGT TAAACTACTG TCTTGTGATG ATTTCCACGT AACTTGTTGT TCTTCTTTTT 3960
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TTGGGTCAGT AATAACGCCA ATACCTGTAA 
GTAACGTGAG GATAGCGCCT ATAATTGCGC 4080

TAGCTTGATT TAATTGAGTA GATAAATCTA ATCCGAATAA ATCCGTGACT TGCTTGATAA 4140

ATAGCAACAA TGCTCCAACT AAACCAGTTA GTACTGCTTT GTTTTTGAAT CTCAATTTCC 4200

AGTTAATATC CATTTGTTTG CTCCTTTTAT CCAAAATAAA AAAACGACTA AAAATTAGTC 4260

GTTTAAAATT ATTCAATGGT CAATGTCGGA GATCCTGAAT AAACATCACT TATAGTGACG 4320

TACAACATCC CTGAAGGATT ACTAAAGTTG ATATTTTTAC TTGCAACTCC GCTATTGACT 4380

CCTGATATTC CTAAATCACT TGACCCTAAA TTAGTTTGCG AAATCCTCAT TATACCGCTA 4440

CGTACATTTT CTATTGTCAC CTGATAACTT TTATTGGGTT CAACTCCATT TATTGTCCAT 4500

TTTGCTGTTG ATTCTTCTAT GCTATCCGGA TATTTATTTT TAGGTAAGGG TTTTATTACA 4560

AAAGATGAAG GCTTTTTCCA TACTTGGATA TTTCCAGCAT ATACTTTTGT ATATTCTTCA 4620

CCTTCGTAAA TAAACTTCTT TACATTTTTA AAATTACCTT CCATAAAAAT CACCCTTTAA 4680

TTAAATATAA CGTATTCGGG TCTTTTTGAT ATATATAGTT ATATTCATTT TCTGTTCCTG 4740

TCCAAATTTT AACCGTCGGT TGAGATGCGC TTTTTAGTTG ATATAAATTA TCCGCTTGTT 4800

GTTTAGTAAA AGCTTGAGAT GACAAAACAT ACCGCTCGTC ATGATTATGA TTTTTTGGAG 4860

CATATAAATC ATTTAGTGTT TGTTTGAATT CCTCAAAATC TTCTGTATTA ACTTTTGAGC 4920

CAATCTGTTG CAATACACTT TCTGAAATAG AGTTGTTTTG TATTGCTTCT GCTAATTCTC 4980

TTAATGTGTT CATAGATTCA GGCGCGCTAT CAACTAGTTC AGCAATTTTT GTATCCGTAT 5040

ACGTTTTAGA GTCGTTGAGA GTTGTATCTT TGATTTTTTC AACTTCTTGC AATTTATTTT 5100

CTAACCCTTC AACATTTGCG ATATTGATTT TGTCCAATAA CTCAGGTTCT GCTTTGATAT 5160

CTGTATCTTT ACCATCAATT TGCCACATTT TAGTGTCAGG ATTGATTGAT ACTACAGTAC 5220

CGTTTTTACC GGGTGCGCCT TGTTCTCCTT TTTTACCTGC TTCACCTTTT GCTCCAGGTT 5280

GTCCCGGTTC ACCTTTATCA CCTTTCGCAC CTTTAAATCT ACTTTCATTC TTTTCGATGT 5340

AAGAAATGAC ATCTTTATCT ATTTTCTCTT TAAAGTCTTT GCTCAATAAA TCTGTCGCGT 5400

TATCTTTTAA AATTCTCGTA ATAGCATCAT CTACCAATTT AACATCGATT TCTTTTGCTA 5460

CAGCAGATTC AATACCACTA TCAACGATAT TGAAAGAAAA GTTTGCGACA TGTATTTTTT 5520

CTTCTTCTTT CTCTAAAAAC AGCTTACAGC GAACATAACC AGCGTGTTTG ATAACCTTTT 5580

TAGGTATCTT GTAGGTAAGG AAACCTTTTA CAACATCGTC GATAATAAGG GGCTCATTTT 5640

TGAATATAGA GCCATCTTCC ATAAACAAAT GTAATCTAGG TGTTAAGCCA TGTGCTTTTA 5700

GATCGATACG ACCTTGTTTG TCATTGATAC CTATTCTTAT AGATGCTGTA TTTTCATCTT 5760
CAACATCTTT TATTTTGTAC ATTTACACAC CTCTTTATTT ATATTTATCC CTTGTGAAGT 5880

AGATACCTTT TAAGCCGATT TGTTTATATA ACTTAGCGAT TGTACTTGCT TGATGTTGGC 5940

ACCACTCTAT AGCAGTAGCG TATTGGTGGG TAGCTGGATT CTTAGGATTC CATCTAATTC 6000

GGTACAATGT GTTTTGACCT TTATTGATGT AATCCTTTCT TACGAAGCTA GCACCGCCCA 6060

TGATTGCTTT TGCTGGAGAT GTCCAACCTT TATTCCTTGC AAACGTCATT GCGTAGTTAG 6120

GATTGTTGTC GTAAGCGCCA ATGCCGAAGT AGTTGTATAC TCCATCTTTT CCGTTAGCGA 6180

AGTTACTTGT TCCATATCCA CTTTCTAAGA AAGCATGCGC GATTAAATAA ATTTCATTAA 6240

TGTTGTGCTT TTTACAAGCT TCTGCGAACG CTTTACCTTG ATTATTCAAT GTTCCCTTAC 6300

CTTTAAGTAT CTTATTAAGT GCGCTAACTG AAACACCTTG ATACTTGCCT AAATTAAGCA 6360

TTTGGTAGCA TTGTGTGTTA CTTTCCCATA TACGCTTTAC ATTCATTGCT GAACTCGTTT 6420

GTGCTCGTGT AGCGTTASCC AACCCCAAGC ATTAGATTTT TTCGGGTTAC CTCTTGCCAT 6480

TTGTTTATCC AGTGCTTGTT TGAATGTATA AGGACTCGTT TCTGTTATGA TCTGCGGTTG 6540

TTTAGATGCC GAACCATTGT TGGCTGTTGG TGACGAGTCT CTTACATTAG CTATATCAGC 6600

GTTTTTATTA TCTACCATAA CTTTTATTCT AGATTTTGTT ACTGTTGGCT TAGTTATAGA 6660

ATTTAATAAT TTTTCTCTGT TTTTAAATAT ATTAAGTAAT GCCTTTTCTA ATGCTTCGTA 6720

TTTATCTTTA GGAGGAACAC CGTTGTCAAT CATATTCCAA TTAACATGTT CCAACATTGA 6780

ACGCCAAATG CTGTCGTCTA CTTTTAAATT TTCAATACTT AGAGGTATCT CATATTTGGC 6840

CATCATATCT ACAGCTACAA CCATTGCGTG AATCTCATTA AAAATAAATT CATTTTTACT 6900

CGCACTATAA TCTTCACATA CGTCTATAAC TATATAATCA GGTTCATTAG GAACTTCAAA 6960

TACAGCTCTT CTAGGTGCCC AAATATTATG TCTATCAACA TAAAAGTGGG GATATTCTAC 7020

ATCCTGTTTG TATTTCTTCC TACTGTTATA TAAACTTTCT ACCGAGCTCA TCGTTTGTGC 7080

GTTTCTAATC ATTATTCCTT TAGGTTTTTC GAGTCGTCGA TTACCTTCTA CTATAAAGTG 7140

ATAAATATAT TCTGGATAAT TAACCTCTTG GCTAGAAATA GTGTACTTTA TAGTTGTTAC 7200

ATCTTTCCAA ATTGGAACTT TTTTATTATT TTTTTCGTTA TCATCACTAT CATCTTCTGG 7260

TTTAGGTGCC GGTGTAGTTT TGTCTGGATG ATATGGTGGT CTAACAAAAT ATTTAACCCC 7320

TCCACCTGGT CCATCATGAT AAGAGTGTTT AATTTTATAA GGTGGACTTC CTGTTGCGTT 7380

ATTTGTATAC CAGTTTTGAT CTACGCCATA CCAATAGTCT TTTGTGCATG GTCCCACTAC 7440

AATGTTTACA TGTCCTGCCC AACCACCAGT CCAAACACCC CAGTCGCCTG GTTGTGGTAC 7500

AAAATCTTTT GTATTTCTAA TTATCTTGAA ATCTCTACCT CTATAATTGG ATTTTTGAGC 7560
TAAATCCCAG CATTGTGCTC CCATTCCAGA ACCAGGTACA TCAATAGCTA TTTTGTTTTT 7680

AGCGATATAT AACGCCCATT CAACCACTTC ACTAGCTGTG GGCTTTCTAT TTTTCGGATT 7740

AGGTAATCCC ATGTATGCAC CTCATTTCAA TCAAAATAAA AAGCCAGTGC CGAAGCACTG 7800 

ACTCTTAACT GTTATTTACA TTTACCAAAC CAGAAGCACG CCCAGAAGCT ATATCCTAAA 7860

ATCCCTTTAA GCATGGTAAT CACCTCCTTT AAATACCAAA AACAGTTCTT AGTAAAGCTA 7920

TGACAATCGT ACTGAAGATA GTCCCTATCA AACCTAGAAT CCACATTTTT ATGTCTCTAA 7980 

TATTCTTGGC ATTCTTTTCT TTATTCTTTT CATCTTCTAC CTTGTCGCGC TTTAATTCTT 8040

CAAAATTTCT ATCTAATTTG TCATAAATCT TTTCTTGCGC TCTAAGACTA TCTTCTATTC 8100

TGTCGAATTT TTCAAACATA GTCTTATCAT TTTCTTCTAA TCGCGTTAAA CGCCAATCTT 8160

GTTCATGTCG TTTGGTAAAT CCAAACATTA TGCCACCCAC TTTATTCAAA TTAAAAAGCC 8220

ACAAGCATTA CACCTGTGAC TTTTCATCTT TTGTTTCTGG ATATTTTTCT CCAGTGATTA 8280

AAGCGTATTC TTCTTTATCG ATTAAACCCT TGTCTACGTA CCACTTAATT TGCTCGTTTT 8340

TATAGTAACC CCAAACATAA AAAGTTTTAA TGTCTTTAAA AGTTGGATAA ATCATCTTCA 8400

TTATTTAAAC GTCCCCCTCA GTACTTGTTT TGTTAGTTTT CAGTTCAGTC AACTGTTGTG 8460

TTAACATAGC GTTTTGTTGA GCTAATTCCA TTGTTAATAC GTTTACTTGT GCCACCTGCA 8520

TTTGCATACT CGCAACCATT CCGCGAAGTT CCTCATCACT TAAATCTGAC GCACTTTGTT 8580

GGTTTGATGC ATTCGGTACG TCTTCTTTTT CGAAATTGCT ATTGTATTTA ATTTCGCCGT 8640

TAGTGAAAAC AAACTTTCTA GGTTCGAACT CTTCTTTAAA TTTAATAGGC ACATTGTTAT 8700

CATCTACATC TAAACTATTG CGTAAACCGC CAGTATTAAC GAATCCGATA ACTTCGTTTT 8760 

TATCGTTTAC TGTGATTTTC ATTATTTCCA CCCCATAATT TTAGTTATAG TAACTTTGTT 8820

GGCAJTCGCT CCAGAACCTG ATGTTTTACC TAAATCAAAG TACACATCGT TATCTATTCT 8880 

TAAAGTAGTG CTACTTGTTT TGGATAGTAA GCACTCATAA ATACCGCCAC CGTTGCCGTC 8940

TGAGTCAACT ACATTCGCTT TACTCAATTG AATCGCGTTA GGTAATGCGG TTAGTCCGAA 9000

TCCCTCAATA ACGCCACCTG GATAAGTTCC ACTTACCAAC AAAATAGAAT AGTTTGTGTA 9060 

CGGTTCAGTT AGATTGATTG TTGTACCTAC ACCATTTGCG CCACCGTCGA ACAATACCGT 9120 

TGATTTATGT TCATTAGGAA CTGTCCACTG TTGCTCAAGT CTGCCGTTTG TGATTGATCG 9180

TGTGTAAATC TTTTTAGAGT TATAAGGTGT GAAGTTAAAT AGCTTGTTTG TATCATCTTT 9240 

AACGAATACC GATAAATAAC CCTCATAACT TTCAACGCTA CCTGGTAAAT CCGGCACTCT 9300

TGTTGCATAG TAATTACCAG CAGTTAAATA TCCCAAATCG CCTTGCGCAT TATTTAAGTT 9360

GAATTTATCA TCTACATACT GCTTAGCTTG ATTTAAAGCG TTGTTAGACG TTTCTTCAAC 9480

AAATTGCTTA GTTAAGTTTC CATCATTCTT TTTATAAAAC GGGTACCATG TGCCGTAGAT 9540

TTTGTATTTT GTGTACTCAT CGTTTGAATC GTCTGGGTAC CATGTTGCAC GAGCAGTATT 9600 

ATTATCAACA ACATAAACAA CTAACACACC AGATTTGCTT GATGTATAAG TTGATTCATC 9660 

GAACGAAGAA CCGTCATCAA CACCATCTTG TCCAGGCTTC TCTAACGTGC CTATATCCGT 9720 

CTTTTCTGGC GCATCTGTTG CATTAGTAAT ATGAATAATC CTAGATGTGT TAACTGCGCT 9780

TAAAACGCTA TCTATGGACT GCTCATACGA TTCAATTGCT TTACCGTAAT CATCTGTAAG 9840

TTTAGACTTT TGCCAATTCG TTGTTGAATT ACCTTTAACA AGGTCAGCGC CATTGATTTG 9900

TTGTTCAACT TCGTTAACAC GTTCAAAAAT CGCTTGCTCT TTTTCAACTA TTTTATCGAA 9960 

TTCAGCTGTA ACAGCTTGTG TTGCACTAGT TTGCGTCGCA GTAATAGCTT GTATAGCTTC 10020 

GTTTTGCTTG ATTTCGATTT GTTGAATGCC TTTTGTCGCA CTATCATTCA CTTTTGCTAT 10080

TAACGTTTGT GTATCAGCCA TATTTTGCTT TAATTGGTTA AAATCTTTAC CGACAGCTTC 10140

GATAGTATCT TGAATAGATT TGATATAAAC AAGCTTTGTT ATACCATCAA ACCCACTAAC 10200 

TAAATCATTT TCAATATTGA AGCTAAATTG ACGTTCAACA ACAACATTAT TACTCCCGTT 10260

TTGTGTAAAG AATGCCTGAG CATGCACCTT GCCTGAATGT TTTAAAAATT CATTCGGTAT 10320

CACATACTGC AAACGCCCAT TAATTGCGTC TACTATCGTT AATTCGTCTG AAATATAAGC 10380

GCCTCTATCT ACGTTATAAT CATCGGTTTT TAAnACGATA GATGTTTTAA CATGTTCAGA 10440

ACTTATAGAT AAGGGTCTGT TATnCTTAGT 10470 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3647 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

ATCAGATCTT GAGAATCGAG TTATTAAGTC TATCGAAGAC TTAACTAAAA TCCAACCATT 60

CATGCCTACA CAAGATTTTG ATTTTAAAAC TAAAGAAATT CAATCAAACA TTTCTGAAGA 120 

AAGATTTATC GAAATGATTC AGTATTTCAA AGAGAAAATA ACAGAAGGGG ATATGTTCCA 180 

AGTTGTGCCA TCAAGAATTT ACAAATATGC GcATCATGCT AGTCAGCATT TAAATCAACT 240

TTCGTTTCAA CTGTATCAAA ATTTAAAACG ACAAAACCCA AGTCCATATA TGTATTATCT 300

TCAAATTGTA ACAACTAATC CTATTGCAGG TACGATTCAA CGTGGTGAGA CGACACAAAT 420

AGATAATGAG AATATGAAAC AACTACTTAA TGATCCAAAA GAATGCAGCG AACATCGTAT 480

GCTAGTTGAT TTAGGACGTA ATGATATTCA TAGAGTAAGT AAAATCGGTA CCTCAAAAAT 540

TACTAAATTA ATGGTTATTG AAAAATATGA ACATGTTATG CATATCGTAA GTGAAGTCAC 600

AGGTAAAATA AATCAAAATT TATCGCCAAT GACAGTTATT GCGAATTTAT TACCAACAGG 660

TACCGTTTCA GGTGCACCAA AATTACGTGC AATTGAAAGA ATATATGAAC AATATCCACA 720

TAAACGGGGC GTTTATAGTG GTGGTGTTGG ATAGATAAAT TGTAATCATA ACTTAGATTT 780

TGCATTAGCA ATTCGAACGA TGATGATAGA TGAGCAGTAT ATCAACGTAG AAGCTGGTTG 840 

TGGCGTTGTA TATGATTCTA TTCCTGAAAA AGAACTGAAT GAAACGAAAT TGAAAGCTAA 900 

AAGCTTATTG GAGGTGAGCC CATGATCTTA GTTGTAGATA ATTATGATTC CTTTACATAT 960

AACCTAGTGG ATATTGTTGC TCAACATACT GACGTCATTG TTCAATACCC TGATGATGAT 1020 

AATGTGCTGA ATCAATCGGT GGACGCTGTT ATTATATCTC CTGGTCCAGG GCATCCATTA 1080 

GACGATCAAC AGTTAATGAA AATCATATCA ACCTATCAAC ACAAACCCAT TTTAGGTATT 1140

TGTTTAGGGG CTCAGGCACT GACTTGTTAC TACGGTGGAG AAGTCATTAA AGGCGACAAG 1200

GTTATGCACG GCAAAGTTGA TACACTAAAG GTTATATCGC ATCATCAACA TCTGTTATAT 1260

CAAGATATAC CAGAACAGTT TTCAATTATG AGATATCATT CATTAATAAG TAACCCTGAC 1320

AATTTTCCAG AAGAATTGAA AATTACTGGA CGTACCAAAG ATTGTATACA GTCATTCGAG 1380

CATAAAGAAA GACCGCATTA TGGTATTCAG TACCATCCTG AATCATTTGC TACAGACTAT 1440

GGTGTCAAAA TAATTACAAA TTTCATTAAT CTAGTGAAGG AAGGATGAAA ACCATGACAT 1500 

TACTAACAAG AATAAAAACT GAAACTATAT TACTTGAAAG CGACATTAAA GAGCTAATCG 1560 

ATATACTTAT TTCTCCTAGT ATTGGAACTG ATATTAAATA TGAATTACTT AGTTCCTATT 1620

CGGAGCGAGA AATCCAACAA CAAGAATTAA CATATATTGT ACGTAGCTTA ATTAATACAA 1680

TGTATCCACA TCAACCATGT TATGAAGGGG CTATGTGTGT GTGCGGCACA GGTGGTGACA 1740

AGTCAAATAG TTTCAACATT TCAACGACTG TTGCTTTTGT TGTAGCAAGT GCTGGcGTAA 1800 

AAGTTATAAA ACATGGtAAT AAAAGTATTA CCTCaAATTC aGGTAGTACG GATTTGtTAA 1860

ATCAAATGAA CATACAAaCA ACAACTGTTG ATGATACACC TAACCAATTA AATGAnAAAG 1920 

ACCTTGTATT CATTGGTGCA aCTGAATCAT ATCCAATCAT GAAGTATATG CAACCAGTTA 1980 

GAAAAATGAT TGGAAAGCCT ACAATATTAA ACCTTGTGGG TCCATTAATT AATCCATATC 2040

ACTTAACGTA TCAAATGGTA GGCGTCTTTG ATCCTACAAA GTTAAAGTTA GTTGCTAAAA 2100

AAGCAACACT ATCTGGTGAT AATTTGATAT ATGAATTGAC TGAAGATGGA GAAATCAAAA 2220

ATTACACATT AAATGCGACT GATTATGGTT TGAAACATGC GCCGAATAGT GATTTTAAAG 2280

GCGGTTCACC TGAAGAAAAT TTAGCAATCT CCCTTAATAT CTTGAATGGT AAAGATCAGT 2340

CAAGTCGACG TGATGTTGTC TTACTAAATG CGGGTTTAAG CCTTTATGTT GCAGAGAAAr 2400

TGGATACCAT CGCAGAAGGC ATAGAACTTG CAACTACATT GATTGATAAT GGTGAAGCAT 2460

TGGAAAAATA CCATCAAATG AGAGGTGAAT AATATGACGA TTTTATCAGA AATTGTTAAA 2520

TATAAACAGT CACTTTTACA AAATGGCTAT TATCAAGACA AACTTAATAC CTTGAAAAGT 2580 

GTGAAGATTC AGAATAAAAA ATCTTTTATA AACGCAATTG AGAAAGAACC AAAGCTAGCA 2640

ATTATTGCAG AAATTAAATC GAAGAGTCCT ACAGTTAATG ACTTACCTGA ACGAGATTTA 2700 

TCGCAACAAA TCTCAGATTA TGACCAATAT GGTGCAAATG CCGTGTCCAT TTTAACTGAT 2760 

GAAAAGTACT TTGGTGGTAG TTTTGAAAGA TTACAAGCAT TGACGACAAA AACAACATTA 2820

CCCGTATTAT GCAAAGACTT TATTATAGAC CCGCTTCAAA TTGATGTTGC TAAACAAGCT 2880

GGTGCATCTA TGATTTTATT GATCGTTAAC ATCTTATCTG ATAAACAATT GAAAGATTTA 2940

TATAACTACG CTATATCGCA AAATCTAGAA GTGTTAGTTG AAGTACATGA TCGCCATGAA 3000

TTAGAACGTG CCTATAAGGT TAATGCTAAA TTGATTGGTG TAAATAACAG GGACTTAAAA 3060

CGATTTGTTA CAAATGTGGA ACATACAAAT ACTATTTTAG AAAATAAAAA AACAAATCAT 3120 

TATTATATTT CTGAAAGTGG TATTCACGAT GCATCTGATG TAAGAAAAAT CTTGCATAGT 3180

GGTATCGATG GCTTACTAAT AGGTGAGGCG CTTATGCGTT GTGACAATCT ATCTGAATTT 3240

TTACCACAAC TGAAAATGCA AAAGGTGAAG TCATGATGAA ATTGAAATTT TGTGGCTTTA 3300

CATCAATAAA GGATGTTACA GCGGCCAGTC AATTACCTAT TGATGCX3ATA GGTTTCATCC 3360

ATTATGAAAA AAGTAAAAGG CATCAAACAA TTACCCAAAT AAAAAAGTTA GCGTCTGCTG 3420 

TTCCAAATCA TATCGATAAA GTATGTGTCA TGGTAAATCC TGATTTAACA ACAATTGAAC 3480 

ACGTATTAAG CAATACGTCA ATTAACACAA TACAGTTACA CgGCACAGAA TCTATTGATT 3540

TTATACAGGA AATTAAAAAG AAATATTCAA GCATTAAAAT CACTAAAGCT TTAGCTGCaG 3600 

ATGgAAAACm TwATCCCAAA caTtAAtnAA tnTTAgGGGG TCCGTGG 3647 


(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5966 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CcAcCTTGAC CACCTTTACG TGGAATCTTT TCmCCTkGAG CAACaTCGaT AATaTATATT 60 

GAAAgTCAAC AAGTTCTGGA CTAAATGTTG CTGCTAAGTT ATCGCCACCA GATTCTATGA 120 

AAATTAGTTC TATATCGTCA TGACGTTCTA ATAATTCGTC TATTGCTGCA AAGTTCATAG 180 

ATGCATCTTC ACGAATCGCA GTATGAGGAC ATCCACCAGT TTCAACACCA ATGATACGAC 240

TTTCAGGTAG AACTCCTGAA TTTACTAATA TCTTTTCGTC TTCTTTTGTA TATATATCAT 300 

TTGTAATAAC GCCGATACTC ATTTCTTTTG AAAGACGTTT TACAACTTTT TCAATTAATT 360

GTGTTTTACC TGCACCTACA GGACCACCAA TACCAATTTT AATCGGATTT GCCACAATTA 420

TAACCTCCTA TGATATGAAA tTCTAACATT GaCGTTCTCA TGCGCCATTT GATTTAGTTC 480

TAAACCAGGC GCTGTCATGC CAAAATCTGC TTCTTTTAAT TCGAAAATCT GCTTTCTTGT 540

TCCTTCTATA TAAGGAATCA TGTGAGTAAC TATCTTTTGA CCAGCAGTTT GTCCAAGTGG 600

AATAGCACGA ACAGCATTTT GAGTTAAACT TGAAACATTT TGATATAAAT AGTAATCAAT 660

AATCGTTTCA ATATCTACAC CTAAATGATG GCCTAGCATA GTAAAACAAA TAGCTGGATT 720

TnACTTTGCT TTCTTATCTT GCATTTGTTG ATGATACCAA GCAATCCATG GGCTATtATA 780

AAGTTCTAAA GCCAATTTAA CCATGCGAGT CCCCATTTGT kTTGCACCAA CACGTGTTTC 840

TTTAGGTAAG TTTTGrACAr ACATCAGTTT ATCTATGTGT AATACTTTTT GTGTATCATC 900 

ATTTTCCAAT GCATCATAAA CTAaACGCAT GGCTAAACCA TCAGAATAGG TAAGTTGCTC 960

TTGTAAAAAC ATTTTTAACC AAGCAATAAA AGTATGATCG TCATGAATTA TATTTCGTTG 1020

AATATATGTT TCAAGACCAA ATGAATGACT GAAAGCACCT GTTGGAAACT GTGAATCACA 1080

GAACTGAAAT AATCTTAAGT GTGTATGATC AATCATGAGA ATGCCCTATA TGTCTGAAAG 1140

CCTTATTAAC TTTACGGTCT TCTCGAACAT ATGGGATGCC TAAACTTTTT AATAAATCTT 1200 

CAACTAAATA ATCATATTGT ACTAGCATTT CAGTCTCTGT AAATTGTGCT GGCAAATGAC 1260

GATTTCCTAA TTGATGGGCT ATATCTCCCA TTTCTTGCAA TGTTCTTGGT TGAATCACTA 1320

AAAGATCTTC TGAATTAACA TCCACAATAA TCATATTATG GTCATCTGCG TATAAAATAT 1380

CTCCATATTG TAAGTCAATA GGTTGTTTTA AACGAATGCC TATTTCAGTG CCATGGTCTG 1440

TAACGACTCT TTGAATACGT TTAACAAGAT CTGAATTTTC AAGGTATACT TTTTCGACGT 1500

GCTTTTGTTT TTCTGAATTT GACAAATTGG CAATATTGCC TTGGATTTCT TCAACAATCA 1560

TTCTATGTTC CTCCTAGAAT AAGAAGTATC TTTGAGTTAA TGGTAACTCA GTTGCTGCAT 1620

TACTTGTAAT TTTTTCTCCA TCTACATATA CTTCATATGT TTGTGGATCA ACGTCTAATT 1680

GACGCACCAT GCGTTTTAAA TTTAATGCAC GATTGATACC ATTTTCATAA GCAGTTTTAG 1800

ACACGAATGT CATTGACGTA CTTGTAAGGT TTCCGCCGTA TTGACCATAC ATTTTACGGT 1860 

ACTTCATCGG TTCAGATGTA GGTATAGAAC CATTTGCATC GCCATTTACG GCAGAGTTAA 1920 

TTAATCCGCC CTTTACAACT AATTCAGGTT TAACCCCAAA GAAAATTGGG TCCCATAAGA 1980

CAATGTCAGC TAGTTTGCCC GGCTCGATAG ATCCTACATA TTCAGAAATA CCATGTGTAA 2040

TTGCTGGGTT AATTGTATAT TTAGCGATAT AACGTTTGAT GCGATTATTA TCATTATGTT 2100

CAAAATCACC ATCTAAAGGA CCACGTTGTT CTTTCATGCG ATGTGCTACT TGCCATGTTC 2160

GTGTAATTAC TTCACCTACA CGGCCCATTG CTTGTGAATC GGAACTAATC ATACTGAATA 2220

CACCCATATC TTGCAGAACA TCTTCTGCTG CAATCGTTTC TTTACGAATA CGTGAATCTG 2280

CGAATGCGAT ATCTTCAGGA ATAGCCGCAT TTAAATGGTG AGTAATCATT ACCATATCTA 2340

AATGTTCATC TACAGTATTA TGTGTATAAG GCAAAGTTGG ATTTGTAGAT GAAGGTAAAA 2400

TATTTGAAAA TGCAGCGGAT TTAATTAAAT CAGGCGCATG ACCGCCACCA GCACCTTCAG 2460 

TATGGTACAT ATGAAGTACA CGGTCTTTAA CAGCAGCCAT TGTGTCTTCC ATAAATCCTG 2520 

CTTCATTTAA AGTATCTGCA TGTAATGCAA TTTGAACATC AAATTCATCA GCAACATCTA 2580 

ATGCATGACT CAAAGCAGAT GGTGTTGCAC CCCAGTCTTC ATGTACTTTT AATCCAATTG 2640 

CTCCGGCATT GATTTGTTCA ATGAGTGCAG TTGGATTTGT TGCTTGTCCT TTACCTGTAA 2700 

AACCGACATT AATCGGTAAA CcTTCGGCAG CTTCTAACAT TCTATGAATA TGCCATGGAC 2760

CTGGAGTTAC AGTTGTTGCT TTAGAACCTT CTGAAGCACC AGTACCACCA CCAATATGAG 2820

TGGTAATAGG AGTTTGTAAT GGGAeeTCTG GTTGTTCAGG ATTAATAAAA TGAAGATGAG 2880

TATCAATACC ACCAGCAGTG ACGATTTTAC CTTCAGCGGC AATGATATCT GTTGTTGAAC 2940

CTATAATAAT GTCGACATTA TCCATTATAT CTGGGTTGCC GGCATTAGCT ATGGCGAAAA 3000 

TATAACCATT TTTAATGCCT ATATCAGCTT TAACCACTTT ATCGTAATCG ATAATAACGG 3060 

CATTAGAAAT GACAAGGTCT GCAACGTTCA CGTCATCACG TGTTACACGA GGATTTTGCG 3120 

CCATACCGTC TCTAATAGAT TTACCACCAC CAAAAGTAGC TTCTTCACCA TAAACCGCAT 3180

AGTCTTTTTC TATTTGAGCA AATAGATTCG TATCACCTAA ACGAATGGAA TCTCCAACAG 3240

TTGGACCGTA TAAGCTCGTA TATTGATTTT GCGTCATTTT AAAGCTCATG ATCTTTTTCC 3300

TCCTTTTTTA TTCACGTTTT CAGCACCGTT ATCTCCGAAT ACACCTGCAT ATTCATCATT 3360

TTCATCAGTT GGGCGATAGA CACGTGACTC ATCGATAGGA CCATTGACCA TAGCACGAAA 3420

ACCAAAAATT TTACGTTTGC CAGCATATTC AACTAATTGA ACTTCTTTTT TATCCCCAGG 3480

TTCGAAATCT AATGCTGCAT TTGCTTCATA AAAATGAAAA TGTGAGCCCA CTTGAATTGG 3600

TCGATCTCCT GTATTTTCAA CTTCGATAAC TGTTTCAGGA TGATGGTTAT TAATTTCAAC 3660 

CTCTGTACTT TTTGTAATAA TTTCTCCTGG TATCATTTGA CTGCCTCCTT TAAACAATAG 3720 

GGTGATGTAC TGTGATTAAC TTAGTACCAT CGGGGAACGT AGCCTCGATT TCGATATCTG 3780

TAATCATGTG TTCGACACCA TCCATGACAT CTTCTTTGTT TAGAATTTGT CTACCATAAC 3840

TCATTAACTC TGCAACGGTC TTACCATCGC GTGCACCTTC TAATAATTCA TCGCTGATTA 3900

AAGCTAATGC CTCAGGATGA TTTAGTTTCA AACCACGTGC TTTACGACGA CGTGCAACTT 3960

CCGCCGCCAC TACAATCATT AATTTGTCTT GCTCTCGTTG TGTAAAATGC AAATTAAAAC 4020

CCCCAATTTC ATATTAGATA CaATTTACAA AATTTATATT AATCCTAATT GTTGTGATAA 4080 

ACAAGTAATA TACAAAGTTC AATGTGTAAT TAGAAAATTA TATTTTTAGC ATATCCGATA 4140

TTGAAGCAAA CAATCTAATC GAAAACAAAT AGTGGAATAT ATTTATGTAA AAACCAAAAT 4200

AGTTTTTAAT ATAACTTTTC ATAGAATAGT AGTATATTAA TGAGTAATGA TTCAAAGGAA 4260

AGGTGAAAGA TTTGAAGATA ATAGATGTGC TTTTGAAAAA TATATCTCAG GTTGTGTTAA 4320 

TTAGTAATAA ATGGACAGGA TTATTTATCT TAATAGGATT ATTTGTAGCC GATTGGACAA 4380

TTGGATTAGC GGCTATTGTA GGTAGCATCA TCGCCTATAC TTTTGCGCGT TTTATAAATT 4440

ATAGTGAGGC AGAGATTAAT GATGGGTTAG CTGGATTTAA TCCAGTGCTA ACTGCCATTG 4500

CGTTAACAAT CTTTTTAGAT AAGTCAGGAT TAGATATTGT TATAACAATG ATAGCAACTT 4560

TATTAACGTT ACCAGTTGCT GCTGCAGTGA GAGAAGTTTT AAGACCATAT AAAGTTCCGA 4620

TGCTGACGAT GCCTTTTGTC ATTGTGACTT GGTTTACAAT TTTACTTTCA GGACAGGTTA 4680

AATTTGTAGA TACATCGTTA AAGTTAATGC CTCAAAACAT TGAAACGGTT AATTTTAGCA 4740

ACAATGATAG AATaCATTTC ATTCAGTCAT TATTTGAAGG ATTCAGTCAA GTATTTATCG 4800

AAGCGAGTGT AATTGGTGGC GTATGTATTT TAATCGGCAT ATTGATAGCA TCAAGAAAAG 4860 

CAACACTCTT AGCTGTTATA GCTAGTTTGT TAAGCTTTAT CATTGTAGCT CTATTAGGTG 4920

GTAATTATGA TGATATTAAT CAGGGATTAT TCGGTTATAA CTTTGTATTA ATGGCAATCG 4980

CACTAGGATA TACATTTAAA ACAGCGATTA ACCCTTATAT TTCGACTTTT TTAGGTGTGT 5040

TATTAACAGT AGTGGTGCAA CTAGGTACAA CAACATTGCT TGAACCGTTT GGCTTACCTG 5100

CATTAACATT GCCATTTATT ATCGTGACAT GGATTTTATT ATTTGCTGGT ATTAAACATG 5160

ACAAAGTAGA TGCTTGATAG TTAAATCAAA CCTAATATTG TTTGAATATC ACCTTAAACT 5220 

ATACAGCGAA TTGTATAGTT TAAGGTGTAT TTTTATGGAT AAAATTAAGT GCATACTTAA 5280

GTGTTAAACT AGGAATAAAT AATTTATATT GTGTGTTGTG TGGGGTGACT AATATGAATG 5400

ATATGGATAA TTCGTTTTTA ATAACAACGG AAATTCAAAG AAAATGGATT GAAAAATTCA 5460 

AAGTAATTAG AGATACATTT AAGGCTAAAG CTGAATATAA TGATCAACAT AGCCAATTTC 5520 

CATATAAAAA TATTGAATGG TTAATTAAAG AAGGTTATGG AAAATTAACG TTACCAAAAG 5580 

CATATGGTGG TGAAGGTGCG ACCATAGAAG ACATGGTTAT TTTGCAATCA TTTTTAGGCG 5640

AACTTGATGG TGCCACAGCA TTATCTATTG GTTGGCATGT GAGTGTCGTA GGACAAATTT 5700

ATGAACAGAA ATTATGGTCT CAAGATATGT TGGAGCAATT TGCTGTTGAA ATTAATAATG 5760

GTGCATTAGT TAATAGAGCA GTTAGTGAAG CTGAAATGGG TAGTCCAACA AGAGGGGGAA 5820

GACCAAGTAC ACATGCTGTT AAAGCTGATG ATGGGTATAT TTTAAATGGT GTGAAGACAT 5880

ATACATCAAT GAGTAAAGCA CTAACACATA TTATTGTTGC TGCTTATATA GAAGAATTAG 5940

AAAGTGTTGG TTTTTTCTTA GTAGAC 5966

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17310 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CTGTGTCATC GCGAAATAGT TAGGGTCATT CATTAATCCT TTTGAACGTA TTTCATCAAA 60

ATATAACAAT TTCATTAGTA AAGGGGACTT GTTCAAACCA GCTATAATAC AAAATAGACC 120

TATAGTCACA CTGCTTATAA TATAAGAGGT AACGATCACT TTTTTGCTAT TACCTAACTT 180

AAAG5TGATC ATCCCTAAAT AGAAATAAAT GACTACAAAT GCATATTTAA CTGTAGATGC 240

AAGAACTTCC TTAACCGTAA TAAATATCAA ATCATCAAAA AATaGCaAAC AArGCGTAAT 300

AATCATACGA TATGTATACA AAATAATGAm AAACTGTmAA AAATGATTTG CCTTTAATAA 360 

ATGGTTAGCG AAAAACAGTA AATAAACTAA TATTAGTAAT GTGATAAAGT CAGCTATAGA 420 

AACATTCACA CCGGCAATAA CCGAAGATTG CTGAATAAAA ACCGCTAAAC CGATAAGTAA 480

CAATGTTAGT AATTTACTAT TGTGTTGATT TTCCATTATA AACGTCTTCC ACTTCTTTAA 540

TCATTTTCTC CTCAGTAAAA CATTCTAAAT AACGTTTTCT AGATTGATTA CTCATTTTGA 600 

TGTAATCACT GTCTATTAAA TATTTTTCCA GGACTTTAGC AATAGTTTCG GGTTGGTTGT 660 

TCATCATACA TATACCATTA TCAGCTACTA ATTCTGAAAT ACCGCCAACA TGACTGGCTA 720

TTATTAAAAT AAACGTATCG TATTGTGATA ATAAATGACT CGCATTAATG ACATTGCCCA 840

AAAATGTGAC ATCATTTTCT AACCCAGCTT GTACAACTTG TTGCTGACAA TCATTTAATG 900 

TAGGTCCATC GCCTATAAAT GTAAAATGCG CATGATTACT GTTATGTAAT TTCAATATCT 960 

CTATTGCCGC GATTAGATTT TGTGGCAATT TTGGATAAGC AAATCTTGCA ATCATAACAA 1020

ATTGATGCTT TGTCGGGGCA TTAATCTGTA AATCTTGTTT ATTAGGCAAC ATTCCAACTA 1080

CTTCGCCAAT ATTGTTATGT GATTGGCTTT TTAGCGTTTG CTTAACAGCG GGAACATCTG 1140

CAATACCATT ATGTATTGTG GTTAATTTCA ATCGATTAAA TCGATATTTT AACGCTAACT 1200

GTTTATCGAA ATCTGAAACA CAAATAATGC TATCTGTAAT AAGTGACATT AATTTTTCGA 1260

TAACTAAATA TAGAAATTTT TTAGCTGGTT TAACACCCTC TGTAAAAGCC CATCCATGTG 1320

CAGTAAAAAC TATACGTGTG TCTTTCGATT TCGAAATGAa CTtCGCAATT CGTCcGACCG 1380

TtCCAGCTTT GGAAGAATGT AAATGGATAA CATCAGGTTT AATTTTCGAG AATAACTGTG 1440

CTAACACTTT GACAGCTAAA ATATCTTGTT TAAAGTCAAT TGGACCTACT AAATGTTCGA 1500

TAATAATTAC ATTAACTCTT GCATCTAGTT GTTCAATCAT TGGTCCATGA TTGCCTACAA 1560

TGACATAAAC ATCATTGTGT ACGCAAAAAT GGTTGGCGAG TTGAATGAGA TGTGTTTGTG 1620

CACCACCATT GTCTGCTTTA GTAATACAAT ATATAATTTT CAACTGTTAC AAACCCCTTT 1680

AATGCTATAC TTTCAATTTC TTAACATGGC TATCTCATCA GATGAATAGT ATTTATAGCC 1740

ATGCAAATCA ATGATGGCAC ATATTTCTTA ATGCCATTTG ATACTGTCTC AAGGGATTCC 1800

TCGTTATACT GTAACAATTG GTCACAATCT TTAAAATATA ACTTTTATTT GAACTTATTA 1860 

AGTAAATTAA GACTACCTTG AGCCTTCCCC TGTAATAACA ACCATCAATG TTCTAATTGA 1920

TATATATAGT TCCATCATTA AACTACCTTT ATGTATATAT TTCATGTCAT ATTTCAGTTT 1980

TTGTTGCGGT GTTAAGTCAT ATCCACCTTG AATTTGCGCA AGTCCTGTTA ACCCTGGTGT 2040

AACAAGACAT CTTTGCTCGA AACCTATCAC TTCTGAACTA AATAATTCTA CAAATTCCGG 2100 

ACGTTCCGGG CGTGGTCCAA TAAAACTCAT TTCCCCTTTA ACAACATTAA TTAGTTGTGG 2160 

TAATTCATCA ATGCGTGTTT TACGAATAAA CTTCCCGACA TTTGTTATAC GATCATCATC 2220

TTTATCAGCC CATTGCGCAC CGTTTTTCTC TGCGTTTTTG CACATCGAAC GTAATTTGTA 2280 

TATTTTAATT AATTTACCCA TCTTCCCAAC TCTAACCTGA CTATAAATAG GGTTTCCTGG 2340

CGAATCTATG ACGATAGCAA TGGCGAATAT AACCATAATC GGTAAAGTTA AAAATAATAA 2400

AACAATGCTT AAAATTAAGT CAATCGCACG TTTAATTGGG TAATAGCTTT TTCTCACTTC 2460

TTCTAGTTTG TCTAATTTTC TTTGATAGGC ATAACCCTTA TTATTATGGA CAGCTTCAAT 2520

AATTAAAGTA ATCCTTTAAA CCTGTTTCTA CTGTATATTT AGGAACAAAT CCTAATGCCT 2640

TTAAGTTAGA AATATCTGCA TAAGAATGCT TAATATCTCC TTTTCGTGCT TCTTTAAATT 2700 

CATGCTCGAC TGATTTTCCA TATAATTCAC CAATAATACG ATAAACCTCT AATAAATTAG 2760

TAAAAGTGCC TGTACCAATG TTATAACCGT GTCCAATTGC ATCTTTGTGT TCCATAATTA 2820

AGCGTACAGA TTGAACAACA TCATATACAT ATACAAAATC TCTAGTTTGC AGTCCGTCAC 2880

CAAAAAATGT AAATGGCTTG TTATGCTCAA ATGAATCGAA CATCTTTGAA ATCACACCTG 2940

AATATTGTGA CTTAGGATCC TGTCTTGGCC CAAATACATT AAAAAATTTA ACAACCGCTG 3000

TTGGTATGTT ATATAACGAA CAATAATTTA ATGTCGTCCG TTCGCCGTAA TATTTATCTA 3060

TTGCATATGG TGATAATGGT AAGATTAATG ATTGATCACT TTTAGGCAAA TCAGGAAGAT 3120 

CACCATAAAC AGCTGCTGAC GAAGCAAAGA TAAAACGTTT TATATGATTA TTATATTTTT 3180

TAATGATTTC TAACAATCTT AATGTTGCTA CGACGTTTAT TTCTTGAGAT AAGATAGGTT 3240

TCTCAACCGA CTCAGCAACA CTAACTAATG CTGCTAAATG AATAACATAA TCAAATTGAT 3300 

ATGTCTTCAT GATTTGTTCA ACTGCATCAT ATTCACGAAT ATCTAATTCA AACACATGAT 3360

CGTCAGCCAA ACTTTTAATA TTTTCTCGTT TACCTGTTCT ATAGTTATCT AGAACATAAA 3420

CATCATAATC TTGTTGTAAA TCATCTACTA AATGCGACCC AATAAAACCA GCCCCACCAG 3480

TTATCAAAAC TCTTTCCAAA TCTTCCACCT CATTTATACA TTAAAAATAT ATCATAAAAA 3540

CATAAAGTAT TGTAAGCTTT TTATCGATAT TTTTTATTTA TAAAAATAAA ATGAGATAAC 3600

TTTGTGAATT TTTATTGAGA TAAATTAGAT AGTGGTGTTT TTGTGATGTT TTATAATATC 3660 

TTGGGTGTGT TAATACTAAT AATGCTTTCA ACTGATGCAT TAGACTGTGA CATCATAACT 3720

CACTTAAGAA CTTCGCTTAT TAATTTTCTA CCAATACACT CCCTTCTAAG TGCACTAAAA 3780

AATCCTTACT GCTAAGTGAT TAAACTTAAC AATAAGGATT TATTTATCAT TAGTGGATGA 3840

TTATTAACGG AATCTCATAC CACCATCTAC AATAATTGTT TGTCCAGTAA TGTAATCAGA 3900

GTCTTTACCA GCTAAGAAGC TCACTACATT TGAAACATCT TCTGGTTGAG AAACTCTGCC 3960

CAAAGCAATC TGACTTGTAA ATTGTTCCCA ACCCCATGCT TCAGGTTTAC CTGCTTCTTC 4020 

GGCTGTTGCC ACTGCGATAC TTTCCATCAT TGGTGTTTGA ACGATACCAG GTGCGAATGC 4080 

ATTCACAGTA ATACCTTCAG ACGCTAAATC TTGTGCGGCT ACTTGTGTTA AACCTCGCAC 4140

TGCGAATTTT GTACTGCAAT ATAAAGACAA GCCTGGGTTA CCCTCAACGC CTGCTTGAGA 4200

TGTTGCATTG ATAATTTTAC CGCCATGATT GAATTTTTTA AATTGTTCAT GTGCGGCTTG 4260 

AATACCCCAT AGCACACCTG CAACGTTCAC GCCATATACT GTTTTAAACT GTTCTTCAGT 4320

GCCAAATTGC GCGGCAGTTT GTCTTAcTGC GTTAAATACA TCATCACGGT TTGATACATC 4440

TGCTTTGATA GCAATAGCTT TTGTACCATC ACTTGATAAT TTAAGTGCAG CTGCTTTTGC 4500 

CCCTTCTTCA TTGAAATCAA CAACTGCTAC TTTGAAACCA TCTTCCACTA AACGTTCTGC 4560

AATTTTAAAA CCAATCCCTT GTGcTCCGCC AGTTACTAAT GCTACTTTGT TGTTTGTCAT 4620

AAAGATCACT CCTCAAATTT CTTTCCTTTA ATTACATTTT ACTCCTCTTC ATTTGAATAG 4680

TACAACAAAG GTAGCTCCAT TTAACAAAAT ATTCAGATAT TTAAGGTATA GTTAAACGCA 4740

CTACCATTAG TGATTGGCAA TGCGTTTAAA TGTCGTTTTA AAAGTTCTTA TGTTGAATAT 4800

TATTTTTTTA AGTCTCTCGA TTAGTTTGTC ATCAATCTTT TTTCGAGACA TGGTCTTTTG 4860

ATTCAATAGG CGGTTCCGTG TTATCACTGA CAACTTTAGT TGTAGCTTCA TCTTTATGTA 4920

TTTCTTCGTT AAATCCTTCA AGGTTTTTAG TCGTGGGATT TTTAACCTCA GGATGTTCCA 4980

TCATGTCTTG ACTATCAAGT TCCTTTTTAC ACGTGTCTTT ATGTGATGCT TGATTTGCGT 5040

TCCCTTTACT TTTTTGAATA GTGGTAGTAT CTGCTGCAGC TACTAATTTT TTTCTACCTA 5100

AAATAGATAT GGCTGAAACA AACCAGAGTA TTGCAGATAC AAAGTTGCAT AATACTAAAG 5160

CGATAATAGC CAATACAATT AATATGACAC CTTTTGAAAT CCTTTCTTTA AATAAGTCAG 5220

ATGCCAATAC GATGACAGGT ACGATTGAAA GTATAATTAC AAATATAGAA ATTATTGCCG 5280

ATATAACTAT TGTTACTATT AAATAATCAG CTCTGCTACC TGATAATAAA TAGAAAAGGC 5340

CGAAAATTAG TCCATAGCAA ATTACAAACC CACATAAAGT TATAGCCATG AGTACTATAT 5400

AAGCTATTTG AAAATATAAA CCTATCTTTA TGAATGATTT TTCTACATTT TTTTCCATGT 5460

CTATTCCCCA TTTATTTAAA ATTTATACTT TACCTTAAAT ATTCTCTTTA TTCTTTAGTG 5520

ATTTTATCTT TAGATTCAAA TTGATTCTCT GTACTTTCAA TATCAACTTT TTCATTTTCG 5580

TCTGTCGATT CATCTTTTGA GTATTTATTC CAAATCAGCA AAATACCACC AATCAGCCAT 5640

AAAATTGACG AAAGGAAATT ATATAAACAC AGTGCAATAA TAGCATAAAC AATAAAAAGT 5700 

GCACCTCCCA TTACAGAGTA ACTTTCCATA TAAATCGCAG TAAAGATGGT TGGTAAAACA 5760 

GTGAAAAGAG CCAATATTAA TCCTAATAAA AAAATTGTTT CGTAATCAGA TCCTCCAGCA 5820

ATATTAATAG ATATCATCCT AACAAAAACG ACACTAAAAT ATATTTGAGC TACGATGCCT 5880

ATCCAAATTG CTATTTTTCC TATAATTGAG CTCATACTCA TTCCCCATTT ATTTAAAATT 5940

TATACTTTAC CTTAATATAC CTTATTTTAT TTAATTTTTA TATGCAAAAT ACAAAAATGG 6000

AGAACTTCAA TATTTATAAA ATATCAAAAG TTCTCCACAC TATATTGTTT TATTATATTT 6060

TCGCTATCAA TACGCTAAAT CATCATATTT CCCTCAACAT CACAGTAAAA CTATTGCTCC 6120

TTCCAATTGC GCAGTTGTTC AACATCATCA TCTTGTTTAA GTAATGCCAG TGGTACTTGA 6240

AGATTAAGAC ATCGTCCTGA AATATTAAAG CGTGTCACAC CTGCTGGCAC AGTTTCCCCT 6300

TTATGAACAA CCGCTTCAAT TTCCTTATAA CTCAATGGCT GATACTTCAT GAGTACATCT 6360 

TGTTGAGAAA GACAAGGATA TGTACCTTGT GCAATTCTCT CTACAGAACA ACAACCACTA 6420

TAACTTGCGA CAACCTTTTC CCATACTTGA AAATGTGCTT CGCCTAAATC TTTTGTATAC 6480

AAATATTGTT CTGTATCACC ATGACACATT GTAATAAATG GCGCTTCTTG TCTTGTCTCA 6540

GTAGTCCATG GCAAGCGATG TTCTTGTTGT AACGTTTCCC ACCACACACC AAATGGAACT 6600

TTATGTTGCC ATGTACTAAT TGAATATTGT GTTTCATGGA TTTCTTGCAC TGGAACTTTC 6660

TTACATCCTA ACGCTTTCAA ACTTGTATAC CGATGCACAC CATCTATAAC CATATATCTA 6720 

CCATGTTGCA TCGCTGTCAC TAAAATAGGA TGACGTATAA AATCATCTGC TTCAATACTA 6780

CTTTTCGTTT TTTCCAATCT TAAAGGTTCG AATGTTTCGT GAAGATCAAT CTTATCTACT 6840

GGTACCAATT TTAAATGTTC ATGAATATGA TTCAATAGTT ATTCATCCTC CTTTGTTTGT 6900

GTTAAATAAA TAAATTCAGG ATGTGGATGG CTTAAGAAAT CGTGATGTGA AATAGACCAT 6960

CCGTATGCAC CTGCATATTT GAAAACAATA ACGTCGCCTG TACTGATTGC GTCTATCTGT 7020

ACTTCTCTAG CAAAGACATC TTTCGGTGTA CATAATTGAC CGACTAACGT TGTGTCCTGT 7080 

CTCGAAATTG AAACTTTTTC AAATGAATAT GGATTGTCCT TATAGCGATA AATGTCAAAA 7140

GGATGGTTAT GTTGCCAAGA TACCGGCAGT CTAAATTGTT GCGTACCTCC TCTTAATATG 7200

GCATACCAAG CACCATGTAC TTTCTTAATG TCTAGCACTT CTGTCACATA GTAACCAATA 7260

TGTGCCACAA TAAAGCGCCC ACATTCAAAG TTCAATGTCA CATCTTCCAT TTCTTGCTCA 7320

ACGATAAGTG TTTTAAAACG TTCTACAAAA TTATCCCATT CAAATTGGTT AGTTAAATCT 7380

GCATAGTTAA CGCCTATGCC ACCACCAAGA TTGATATGTT TGAGTGGAAA TCGATGTTTT 7440

TCAGACCATG CCTTTGCTTT TTTAAAATAA AGTTTCACTA CATCGACATG TAAATTCGAG 7500 

TCTAAATTGT TAGAAATAGA ATGAAAATGA AATCCATCTA GATGAATCTT TGGCATTGCG 7560 

AGCGCAgcTT cAATGACATC ATCAACTTCG TCTTCAGAAA TACCAAATTG TGTTGGGCGT 7620 

CCTGCCATAT GCAACGTTGC ATTGGGAAAT GGTCCTGCTA AATTAACACG CAATAAAATG 7680

TGTTGTGTCT TATCTTCATC TTCTAAGATG GCATTTAGCC GTTGTAATTC ATGCATACTT 7740

TCAACATGAA TACGCTGAAC ACCTTCACTT ACTGCATATC TTAGTTCCTC GTCTGTCTTA 7800

CCAGGGCCAC CAAAAATAAT ATGATTTGCT GGTTTAAAAG CAAGACCTTT TGCTATTTCA 7860 

CCTTGAGATG CAACTTCGAA TCCTTCAACA TACTGACTAA TTGTATCTAG GATTTTTCGT 7920

TGTTGCAAAT GATGTTCCAG TCCGACTAAA TCATAGATAT AATGACAAAC TGGATGAGAT 8040

TGTGCTTTTA ATTGTTCAAT AACAGGTTGA ACTATACGCA TTAGCCTTCA TCCCCTTTCT 8100 

GTTTAGACGT CGCTAGAGAT GCACTTAAAT GGCGATATAT TTTTCCGCGA TCATCACCTA 8160 

AAATAAATGT TTGTACACCT TGTGCCTGCC ATTTTGCAAT ATCTTCATCT TCACGTGGTA 8220 

ATGCACAAAA ATGTTTACCA TGTGCATTCA CAACTTCAAA AATATGTTGA ACATGTGATG 8280 

TTACTTGATC ATCACGCGTT TGCCATGGTA TGCCAAGTGA CTGCGATAAA TCTGCGGCAC 8340

CTTCGACTAT CATGTCTAAA CCTTCGACTT GTGCTATATC GTCAATGGCC ATAACCCCTT 8400

CAACATCTTC TATCATGGCA ATCACCATAA TATGCTCATT AGCCATCTCC ATTGCATCAA 8460

GTAATGGTGT ACGTCCAAAT CTTGCCATGC GACCACCATT CAAACTTCTT AATCCTTGCG 8520

GGTAATAACG ACTTAATTTC ACAATATGCT CAACTGTCTC ACGATCTTTA ACGTGTGGCA 8580

CAATAATACC TCTCGCACCC ATATCCAACA CTTTAATGAT ATCTCTATCT ATCACTGCAG 8640

TGACACGTAC AATTGGTATA ATATGCGCTG CTTCAGCTGC ACGAATTAAA TGCGCTAGTG 8700 

TCTCATCATT AATCGCCACG TGTTCTGTAT CAATCACAAC AAAGTCATAC CCGCTTGCTG 8760

CGATAACCTC GATCATCAAT GGGTCCGGTA TAGAATTAAA AATGCCATAA ACTGAATCAC 8820

CATTGTTTAA TCTATGTTTC AGAGATAGTT GTTGCATCAT TGATACCTCC TACACCTAAT 8880

GGATTTGTAA CATGATGAAT TCTTAACTCG GAGTCACTTA ATAATCGACG TGTCGTTAAC 8940

TTTTCAACTT GAATCGTAGG TTCAAACAAA TCGAAATGTT GATAGTTATT CAACTCTGGA 9000

AATGCTTCTT GATACGCCTC GATGATGCCT TTAACCCATT GCCATTGCAG CTCCTCATCG 9060

ATACCATATT GCTTTTCAAT AAATAAGATG ATTTCGGCGA TATTAATAAA GAAAAATGCA 9120

TCATGTAAAA AGTCGCGTAC TAAACGTTCG TCATCTGTTT CAATAAATGA ATTACTATTC 9180 

ACTTTTTTAT GTGCTTCTGG CATTGGCTTT AATGTCAGGT GTGAAGCAGC TTCACTTAAA 9240

TGctCACGCT TAAAACGAAC ACCATCATGG AAATCTTTTA AGGCAATACG TGTAGGCCAA 9300

CCATTTTGAT GAATGAGCAT CATATTTTGT GCATGCGATT CAAAGGCAAT ACCGTGATAA 9360 

TAAAGCATAT GAATCATTGG ACGAATCGCT ACAGCTAAAA ATTGCTTTGT CCAAGCTTCA 9420 

GAACCATATT GTTTAATCCA ATTTTCAATG AATGGTACAC CATCCTTATC ACTTGCATAA 9480

AGTGCATTAA ATGGTATCGC ATCCTCTTCA TCGATTAACA TATGATATAT ATTTTCACGC 9540

CATATAACAC CTAACGCACC ATAAACTTGA GTTTGTTTAT AAGGCGAAAG TTGTGTATTT 9600

AAATAAGACT GTCCTAAGAC TTCCCCTAGA AAAACTGTCT TTAATTCATC TTTTAAATAC 9660

ATATCTTGTT GCTGTATCTG CTTTAACCAA TCCGTAATTT GCGCTGCATT TTCAATTGTA 9720

TATTTTGTCG TGTCTATTGG CGACATCGTA CGAATCGATT GTTGAGGGTG ATATAGCTCA 9840

TCACTTTCCC CTAACCATAG TACTGTGCCA TTAAGCCTTT CTTCAGCCAA ATCAACTTGG 9900

ATGACATGTT CAAACTGCCA TGGGTGTACA GGTATCATCT CAACATCATT TACATGTTTG 9960

CCAGATGCTT CAATTTGCTG TACAAAATGT TCATAAGTCT TATCGCCAAC TTGTTGACGT 10020

AACATTTCGT TAACTACAAC ATTTCTTGAT ACCGTCGTTT CTACTTTATC TTTGTCGATA 10080

GCTAACCACT GCAGTTTAAC GTTTGGTACA AAATCAGGAC CAAATTTCAA ATTATCACTC 10140

AACGTAAATC CTAAACGTGA TTTGTAACTT GGATGATACT GATGCCCTTC CATCGCATAA 10200

AATTCATAGT CGTTAAATGT CTCAGGTGTT GCTGGTGGGT TTGATTCTCG ATACTGCATA 10260

CTTTGCGTAT CTTTTAATTC TGTCTGTAAT AACTCGACAA TAAATTGTTC TAGCTTTTCA 10320 

TCATTTTTAG GAAATGTAAA TACAACCTCT CTCAATAATT GTGTATAGTC TGTTGTTGTA 10380 

TCTGCCTCAT CTCCTACGAC ACGCTCAATT GGTGATGTGA TACGTATACG ATCAAAGCTA 10440

TGTGTCTTTT CAGCAGTAAA ACGATACTCT GAATCATGTC CTTCTATTGT AAAATGACCG 10500

ACACCGTCTT GATATGACGC TTTATACACA ACAATATTCT CATAAATAAG TGATGATACC 10560

AGTTGGTGCA TCACTCTAGT CTTTACACGA TTAAGAATTG TTTGATTCAC AATACGATAC 10620 

CTCCTTGTTA TGACAAATTG GATTTGGTAT ATGTGTATAA ATAGGGTTTG CACCACAATC 10680 

ATTCAATTTA CTCATCAAAT TCGCTTTAGC CGcAATGGTC GGCGTTTGAT ATAAATCTTC 10740 

TACACAGTCA ACAAATACTG CGTTATTCGC GTATTCTTTT TTCCAAGTCA TAAGACGATG 10800 

CGCTACAAGT TGCCATAACA CAACTTCATT TCTAGTCGCT TTACCAATAG TTGATACTAA 10860 

ATGTCCTAAG TGATTTA<~TA PAACGTAATA TTTAAGACGA TGCCATGCTT CATCATGTGC 10920 

ATATACAACA GGGCTTGATG CTGCCACAAC ATTTGGCACA AGCTGTTTTT CAGTAGCAAT 10980 

CGTT5TAGAT AGACAAATGC CTTCAAGATC TCTGACAAAG CATACGTCGG GTATGCCATC 11040 

TTTTAATTCA ATTAATGTAT TTTGTACATG TGCTTCTAGA CTAATGCCTG TGTTACTAAA 11100 

CAGCTTTAAT ATCGGCAATA ATGTACGATT CAAATAACAT TCAAGCCATG CTTCTGGTGC 11160 

TAAACCACTT TGCTCAATCA CTTGTGATAA CTTAGACATC GGTGAATCAG GCATCGTTTC 11220 

AAATAATGAC GCCAATACAT GAATATCTTT ATCAGCATGG TAATTCGGTA TCCCTTCACG 11280 

AACAATCATG GCACTATTTG TTAATAAATC CATTTCAGGT TCAACTGTTT GCCCTAATGG 11340 

ATTCGGTAAC AATGCACGAT ATCCTTCTTC AAACATCAAT TTAAAATGGG GTGTTTCAAC 11400 

CTCATCTTTG ACTGATGCGA TAACTTGCGC GGCATCAATT GTCCGTTCAA TCTGTTCAAG 11460 

GTCATTCGTA CGTATAAAAT TAGTGATTTT AACGTGTATC GGTAATTTTA AATAAATGTT 11520 
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GCCAAGGTCT TTTATTAAAC CTTGTTCACT ATATTGCATA TACTGTGGAT GCTGTCGCAA 11640 

CACATTGATT TGATAAGGAT GTGTTGGTAA TAAAATAAAA TCTTTGGGTA TCTCTGATAT 11700 

ATCTATGTCT GCTAATTGAT ACAACACTTT CTCAACCTGA TCTTCTTTAC CTTCTACATA 11760 

GCGCGTGAGC AGAACATCTT GATGCACAGC TAAATAATGC AATTGGAATG ATGTATGACA 11820 

TTCGGGTGCA TATTTCTCTA AATCTGCTTC TGAAAACCCA CTTGCACTCT TAGGAGTCGG 11880 

ATGAAATGGA TGACCTAAGT ATAAAGATTG TTCTGAAACG ATATAACGAT CCTCTACGTA 11940 

GTCTATTGTG TTACTTTGCA AATAACGTGC CGTGCGATGA ATGCTATTAT CGATGTCAGA 12000 

CATAATTTGC GCCATATGTT GTTGCACTGC CGTTTGATTA TCTGCACTTT GAGCCATATG 12060 

TTGCAAAATA CGCGCAATTG CTTCTTTATA AGTTGTTATT TTTTTACTTT TTCCATCGAT 12120 

AAGCCATACC TCTGGATGAT ACATATGATG CCCCATCGCA GACCAATAGC GAAATTCACC 12180 

CGTTAAAGTT TCGAGCTCTG ATAATTGTAT AGACCATTGA TGATTTTGAG GTGGTACTTG 12240 

ATATAAATTT TCTTCTCTAA AATATTCATT TAAAATGCGT TCGATAGCCG CATACGCTGC 12300 

ATGTTGTATT AATTCTTTAT TTTGCACTTT TTTGTTTCAA CTCCCATAAT TTCATTAATG 12360 

TGTGATCGTT GATTTGATTA GTGATGGTTG AACAAATTAA AAATAAACTA CTTACTGCAA 12420 

ATACTACGCC CATAACGATA AACGTAGTAG CTGGTGTAGT ATAACTTGTA ATGGCAGCGC 12480 

cACTaAGACT GCCAATAATT TGACCAACAA CTAACATACT GTTCGTCGTT CCAACAAATG 12540 

TGCCTTTAAG TTGTTGATGA CACGCATTCA CGACAACAAA CATGACACTT TGAATCAATG 12600 

CACTATATGT TAATCCTTGA AGTATTCTTG CAGCCATTAA AAACTCTATA TTCGTCGCTA 12660 

AACCTTGCAG TATCGCACTA CAACCACATG CAATCGTGGC AAATATATAT ACTGATTTAA 12720 

CATATGATTT ATCATTAAAG CGTCCCCATA AAGGCGCGCT TAATATCGAA GCCGTCCAAA 12780 

ATGCGGACTG TAAAAATCCA ATCACACTAC GGTCATCTAT CGCTGTATGA TTCACTGATG 12840 

AAGCAAGTGG TGATAATGCA GTTAGCATGC CATACATAGC AAAGTTTGCT AAAACGCCAA 12900 

CGATAATAAA TCGACATGTT TGTTGTGTGC ATAATAGACA TTGAAATGAA CGGCGAATAC 12960 

CTTTATTAAT ATTTGGTGTT TGTGATTTTG GCATATGTGT CGTTTCAATC AATTTTAATG 13020 

CACCGAAAAT ACAGACAATA AAAGTAATAA CGGCAATACT CATCAGTAAC GCACTAAAAC 13080 

CTAATATCGA AGCTGTAACA CCGCCAATTA ATGGCCCCAC AAGAGACCCT GCGCTGACTG 13140 

AACTTTGCAG TCTTCCTAAT ACCTTTCCAC GATCTTCAGC TGGCGCCTCT GCACTCGCAA 13200 

ACGCACTTGA TGCATCAACA ACACCACCAA ATAGTCCCTG CAATAACCTC ACAAGTACAA 13260 

ACTGTAATGG TGTCGTACAC AATGCCATTA AAAATAAGCA TACCGCCAAA CCAAGTAACG 13320 
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CtATCATCGT CGTTACAGCT GGAGCAGCAA TCGCTATACC ACTCCACAAC TGTATTTCTA 13440 

CGACTGATAG ATTTTGTAGT GATGCCATAT AAATTGGCAA TAATGGCACA AGTACTGTCA 13500 

GTCCAGCAAT CGCTATAAAC TGACTGAGCC ATAAAATGCG AAAGTTACTG CGCCATATAG 13560 

ACTGATTAAT CATATGTCAC CATTGGATTT GGTACGGTAG TTAAACCTGA AGGCATACTA 13620 

CCTCCACCAC TATCACGTTG ATATAGCAAT GGTAATAAAA TTTGTTTGAA TGGCCACGTC 13680 

TGTTTATCAA ATAAAATGTG TCTGACAGCT AGCTGATCAG TTGTAACCCA GGAAATAGTT 13740 

GCCACTTCAT TTTTTAAAAT TTGTTTTAAC AACGACATAA GTTCATGCTC ACTTACACCA 13800 

AATAAATCTT GAATTGCATC AATAATGGCA TATAGATTTA CCGATACAGC TAATGTTTGA 13860 

AAATAAGCAA AGAATGTTTC CAAATCCTCA TTAATTAGCG TATTAGGTGT ATCTTCTCTG 13920 

ACGACATACT TCGGCAATGA AAGCTGATGT GCTGTTAGCC ATGGTTTATA AATTCTGACA 13980 

GTATCATGAT CACGTAACAC GCATTTTTGT ACACGTCCAT CTTCAAATGA CAACAATATA 14040 

TTTTGACCAT GCAACTCTGG TAATGCGCCG TATTGCATAA ATGATAGTGT TACCTTTAAA 14100 

AAGACTTGCG CGATATCTTC AAATAACGTC ATGACATCAT TTTTAGAAAT ATTATCTTTT 14160 

CCACAAATCA TTTGATATAA AGTGCGATCA TTTGCCGCGA GTGCTGCCAT TGACACTAGC 14220 

TGTTGCGTAT CATTTTTGGC TAGCACTTCG GGATACTTTC TTAGCTGAAC AGTTAGATGA 14280 

CCTAATTGAT CTTTGAAAAT ATCATTATCT TGACCCATAT ATGACCACCA AGCTGTTTCA 14340 

TCACAAACCA TGACATACTT AGCTAGTGCT TCATCTTTTT CTATAAGCTG ACGTAATAAT 14400 

TGTTCTGCTT GTTCTCCGTT TTTCATGTAA CGCGTAGGCG TTAGCCTTAA TGCGCCTAAT 14460 

GACTCC^TTG~CAAATGG TTATACGGTG CGCCAATATCP AATTAATGAA 14520 

CGCATACTTG AAGACGACAG ATAATCTCCA AATTTTAACG GTAATAGTAC AACCAACTTT 14580 

TCACTAATCT CTTTCGCAAA GACGTTCGGC AGAATATGCT GATATTGCCA AGGATGTACC 14640 

GGAAATAGTA CATAGTCATC TATTGATAAC CCTTGATCAT TTAACATGTC TGTCGCTTGT 14700 

TCTTTTATAG GTACTGTCAA ATTTTCTAAT TCATCGATAT TTGCAGTATC GCCATGAATC 14760 

ATATGTGTCT TTTTAACTGC TGCAACCATT AAAGGAAATG ATTGATTTAA TTCAGCTTGA 14820 

TACACTTGAT AATCCGCTTC TCTTAATCCT CTTTTTTCTT TAGCTAATGG ATGAAATGGA 14880 

CGATCTTTTA AACTTGCAAA CTGCTCTGAC ATCACAAAAG GATGTGACGC TAAATCTAAT 14940 

TCTGATAATT GTTTAGCAAG CTGTGTGGCA GCAGTAGTCA GTCCTTCTTC AACGCGAGCC 15000 

ACTTCCCATT CATGACTTAG ATCACAATTC ATATTAGCAA TTGTTTGCCA AAATTCAGCT 15060 

GCCGTTAAAG GTTGCTTAGA CACCCTTCCC TCTATCGTAA TTGGTTGTGA ACTTTCGTAA 15120 
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TATATCAAAA GCGTTTGTCC GTTTTCTTTA GTAATCTCAC TATTCGATAC AATTCCGGCT 15240 

ATATCTTCAA ATAATAATGC ATCAACTAAA TCTCTTAATA TTATCGCTTG TGCTGTATTG 15300 

ACTGCTGTAT GATTCTGCAA TGTTCAGACA CCTCGCATTC TTAATATAGG TTCAATGTTG 15360 

TCCCAATATT TTGTTGTTGT GCCTGTTGAT AAATAAAATA AGCACTTGAA ATATCTTCGA 15420 

TAGCCATACC CATCGGATTA AGTAATATGA TCTCATCATC GTCTTCACGT CCTGGTATGT 15480 

CACCTGTCAC AAGTTGTCCT AGTTCAGCAT GAAGAGCTTC TTTGCTGAAT TTACCTTCTA 15540 

ACACCAATTG GTTAATAGTT TTCTTTTCTC GATTACATTG TGACCAGTCA TCTACTACGA 15600 

CTTTGTCAGC TTTAATAAAG ACTTCTTTAT GCACATCCAT GATAGAAATG TTGCTAATAA 15660 

ATGCACCCTT TTGTAACCAA TCATATTCAA TGTATGGTTG ATCCGTTACG GTACATGTAA 15720 

TGACTACTTC ACCATTTGAT ACTGCTTCTT TAGCATTTTC TGTCGCAATA AAATTAATTT 15780 

CCGGACGCTG TTGTTGCCAT CTATCAACAA AGCGTGCACA TGCTTCAGAG AATTGATCGT 15840 

AAACAAACAC GCGTTCAATA TGATCGAATT GCTCTAACAT ACTTTGTAAT TGCTTGTCTC 15900 

CGATTAGCCC GCATCCAATG ATTGTTAAGT CTTTAAATCC TTTTTTAGCC AAATGCTTTG 15960 

CTGCAATCAC TGAAACTGCT GCAGTACGCA TACTACTAAT TAAACTTGCT TCCATAACTG 16020 

CAATTGGATA ATTCGTTTCT GGATCATTCA AAATAATGAC GCCACTTGCA CGCTCCATAT 16080 

TACGTTTCGA TGGATTGTCG TGCTTACTAC CTATCCACTT AATACCTGAA ATTGCGTGTT 16140 

CACCACCGAT ATGACTTGGC ATTGCAATAA TTCGATCTGC GATGTGTCCA TTTTCAGGAT 16200 

CCtGTCTTAA ATACGGCTTA AGCGGTTGTA CAAAATCATT GTGCGCATGG GCTGTTAATG 16260 

CTTCTGTTAA TGCGTCCACA TAAACTTGTG AATGATTACC TCCCGCTTGT TCAATATCTG 16320 

ATCTATTTAA ATACAACATC TCTCTatTCa TTCTGaTTTA ACTCCTTGTC TTGATTTCAT 16380 

TTTTTCTAAC CATGTATCTG AATAAACTAA ATCTAAGTAA CGATCGCCTC GATCTGGTAA 16440 

AATCGTGACA ATTGTTGCAC CTTCTTCAAT TGACGTTATC AACTGCTCAA TCGCTGCAAT 16500 

AATCGAACGT GTTGAAcCTC CGGCAAATAT GCCTTCATAA TCAATCAGTT TTCGACAGCC 16560 

CAAAGCAGAT TGATAATGAT CTACATGGAT CACTTGATTA ATTTCTGATC TATTCAATAT 16620 

TTCGGGTACA CGACTAGCAC CGATACCAGG TAATTCTCTA TTAATAGGTT TGTCACCAAA 16680 

AATGACTGAC CCTTTCGCAT CAACAGCAAC AATTTGTGCG TTTGGATGCA CTTCTTTTAT 16740 

TTTTCTACTC ATACCCATAA TGCTACCTGT CGTGCTGACT GGCGCGACAA AATAATCTAT 16800 

AGGTTGCTTA ATTGTTTCAA CAATCTCTGT GCCTGCACCA TGATAATGGG ATTGCCAATT 16860 

TAACTCATTC GCATATTGAT TAATCCAATA TGCATCGTCA ATAGTGGCTA ACAGTTCTTG 16920 

EP0 786 519 A2 




TACATTGGCA CCATAACTTT TAATAATTTT CAAATTTGTT GGTGATATTT TAGGATCAAC 17040 

AACACACGTG AGTTTTAATC CCTTGATTTT AGCTATCATT GCCAACGCAA TGCCTAAATT 17100 

ACCAGAAGTA CTTTCAATTA AATGTGTATT CTCAGTGATT AAACCATGTT TAATACCATG 17160 

TTCAATGATG TACTTGGCAG GTCGATCTTT CATGCTGCCT CCAGGATTCA TATACTCTAA 17220 

CTTTGCAAAC ACTTCATGTT TCGGAAATAG TTGATGAAGT TGAACCATAG GTGTTTGCCC 17280 

TACAGAATCT AACAATGAAT CGTGCACATG 17310 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5423 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

ATACTAGTAA GCGCATCGGT TATTGACATC GAATTCAACT TTAACAGTTT TCATGTTCGG 60 

TGATGTTTCa ATAGAATGTG TGTGTTGTAC TTGCGCATTT ATATTTCCAC CTAAATTACT 120 

TAAGTTTCCT GTAATACTAG AAATGTCAGG TGCGTTTAAT GTAGGTTGAA ATGCATCAAC 180 

TACTTTATCT GCAACATTAG AAACATTACG GATAACTTTA CTTGAATGAT TATCTATACC 240 

TTTAACGAAA CCTAACATTG AATACATACC AACATCCATG AATTCACGTG AAGGTGAGTG 300 

AATACCTAGC GCTCTTTTGG CTGCATTTAA AGCACCTTTT GCTACACTAG CTGCTTTTTC 360 

AGCTAAGTGT CTAGGGATAT TAGGAATAGG TGTGATGAAA GGAGGGATGA TAIGAGGAGC 420 



TGCTGATACA AAGTCATCCA CAAAGCTTTT AACTTTATTT ACTGCATTTG TCATACCTTG 480 

ACTAACTTTG TTTACAACAT TAACGAATCC TTGAACAACT CTATTAACAA rGTTAATTAG 540 

CGTACtTGTt ATAGTAGATA CCCaTnGCAT ACCTTTAGTG ACmATGAAGT TCCAAGGTTG 600 

AGACATTTTG TCTGATATAG TTGAAACAAC TTGTGTGAAT ATGCTTACAA CTTTATTCCA 660 

AATTGTCGTT AATATACCAG ATAAGAAACT CCAAATCGTA TTCCATATAT TAGAAATAAA 720 

ACTCCATGCC GCTTGTAACG CAGTAGATAT AGCTGTAGTG ATAGCGTTCC AAACCTTAGT 780 

TGCCACAGTA ACTATAGTGT TCCACAACGT TTGTAAGAAC GTCCAAATAG CGTTCCAAAT 840 

TGTTATTGCG ATAGTCATAA TTGTGGTAAA CACTGTAGTT ATTACAGTGA CTAACAAATT 900 

CCAAATCGTA GTAGCGATTG TAATTATCGT ATTCCAGATT GTACTTAAGA ACGTCCAAAT 960 

AGCTGTCCAT ATCGTCATAA CTATTGTCAT TATCGTCGTG AAAACAGTTG TAATGATTGT 1020 
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ATAAG CG ACT ATTTGATTCC AAACAATCAT TATAAAATTG TAAACATTCG ATACTGCTGT 1140 

AGTGATAGCT GTTAAAATAG CATTCCATAC AACCGAAGCT ACAGCTTTTA ATACATTCCA 1200 

AACATTAACC ATAAACGTTT TTATCGCATT CCAAGCATTT ATAATAAAGT TTCTGAATCC 1260 

TTCATTTTTA TTCCACAATA AAACGAATAT AGCTATTAAT GCAGCAATTA CACCAATTAC 1320 

TATTGTTATT GGACCGCCTA AAATACCAAA CACAGTTACT AGTCCTGTGA TAGCATTTCT 1380 

AATTAATCCA ATCTTACCGA ATAACAATTG GAATATAACT GATATAATTT TTAATGGTCC 1440 

TTTTAATAAC ATGAACGCAC CTTTTAAAAT TGTTAATCCC GCTCTTAATA AACCGAACTT 1500 

ACTTACTAAT GCAATGrTTC TACCTATTAA TCCGCCACCC ATAAAGTTAG ATACAGCAAG 1560 

AATAATCGGT ATTAAAAATC TAAATGCACC AACTAAAGTT ATAATGACAC CAACTAATTG 1620 

TGCTGTAGCT GGATGCGCCT CAAACAAGTT AGCTATCCAA CCAGTTATTG CAACTGCAAC 1680 

GCGTAATACT GCACTAGCTA TAGGAGCCAT TGCTGTTGCG AATGCArmTA ATCCTCTTGC 1740 

GATGTTTCCA ATCAATTGCA TTATTAGTGG TCCATTTGTT TGTATATAAC TGACAAAGTC 1800 

TTTAAACCCT TGAGATTGTC CTACTTGTTC AGACCATTCC CTAAACTTAG CTGTCATTTG 1860 

TTCAAGAGAT TGGAATATGC CAGTTGATGA TCCGCTGAAT GCATTCATCA AATTGTTAAT 1920 

TCCAACGAAA ACATTTTTGA AAATATTACC AATGATAGGT AAGTTTGTTT TTGTGTATTC 1980 

AATAAAACGA GTTATCGAAT TTTCTCCAGC TGCACTATTA GCCCAGTTAG AGAAAGATTG 2040 

ACCTAATCTA TCCAACCAAT CAGCCGACCA TTGAAACAGT GGTGCTAATT GCGTGAATAC 2100 

ATTGACTAAT CCGTCACCAA AACCACCTGC AGCACTTAAT AGCTTGTTAA ATACCGAAAC 2160 

ACCCGTTGTA TTCATCATAT TAAAGAATCT TGAAGCTACA CTGCTATTTT CAGCCCATTT 2220 

AAGCACGCTT TGAGACGCTT CTTCCATTCC TCTTGAAATA CCACTAAAAA ACGGTTGTAA 2280 

GCTCTGCATT GCAGTTTTAA CAGTATTTAA ACCATTTGCA AGAGTTGTGA AGATAGCGGA 2340 

TTGATTTTGC TTTATAATAT CAGTCCATGC TGACTTTACG CCATCTAACG CTTTTTTGTA 2400 

TTCGTTTGTT GCTGAGCTAG CTTGTAAAGT GCCATCATTA AGCATCTTTA TAGCGCTGAT 2460 

AGCCATTGCG CCAAACGCTA CAAATCCTGC TCCCGCTATT GCTACGGCAC CACCTAAAGC 2520 

AAGTACACCA CCAGTTAACA CTTTGATAGC GTTTAATAGC GCAAATACTA CAGGTACTAC 2580 

GCTCGCTATT ACAGGTATTA AGATACTAAA AGATGATGTA AGTAATCCAC CAACCATATT 2640 

AGAACCTACA GTACCGAACA CACGGAACAT ATTAGCTAAA TTCCCCATCT GTCTTTGAAA 2700 

ATTGTCATTT GCTTTTATTA TGTAGGCATA AGCTTTCTTT AAACCATTAG TATCGACATC 2760 

TACCTTTGTT GTTTTTTTGT TCGGCAATGC GTCTAATGAT TTTTTAAACG CATAAATAGT 2820 
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AAGTTCTTCT TTAGTACGTT TGATTTTAGA GTTAGCAACA CCATTGTCCA CGTCTATAAT 294 0 

AGCTTTGGCT TTAGACCTAT TTAATGCTTC GAGACTAGCT TTAGATACTT TTAACACTCG 3000 

ATTGAATTTA CTGTTATCTG CATTGACGTC AATATTGACA CGTTTCTTTT CTAATTCTGA 3060 

TAATTTAGCT TCTGTTTCAG CGATATCTTT AATCAACTTT TGTTTTTGCA ACTTAACTTC 3120 

TGGTGTAACT TCTTTAGAGT TTAGTTTGTC TAGTTCAAAA TTCGATTCTA GTACCTTTTG 3180 

TTGTAAATCT TGTATACTAG CATCTAATTT AGCTTTTACA TTTTTGTTAC TAAAGGCATC 3240 

TAAAGACTTT TTAGCAACTT TGATAGTTTT TTGTAAATTT TTATCGTTAG CGTTTAATTC 3300 

AACATCTTTA GTTTGATCTG CTACTCGTTT AAATCTTTGC ACAGACTTAA CCGCACTATC 3360 

AATTTGCCTT TTGAATTTGG CTACACTAGC TTCAATAGTC GCTTTAATTT TATATTCCGT 3420 

CACATTAACA CCTCTCTTTC TATTGCTTAT TAAATTCTGC TATAACTTTA AAGAATTCAT 3480 

TATTTTGTGG TTCGTATTCA TCACGTTCGC TACTAAATCT TATATCTTTA CCTTCGTTAA 3540 

GCCGTTGGAT ATTTTCTTCA TAAGGCAATA CGTCGTTTGC ATTGTTAAAA ACATATTCCT 3600 

CTTTAGGTTT ATTTTCTGTC CCAACATTTT TAGTAGCTGC AGCATCACGA ATAGCAAACG 3660 

CAAGTTTGTA ACGTTCGAAT TCTTGGGTTA GCATTTCATA CTCTTTCGCA TACATTCGAT 3720 

AGTTATATTC TGTTAATGTC ATTTGCTCAA TAACGTTCAA ATCTGTAATA CCAAGTGTTG 3780 

ACATACAAGT TATAACGATT CTGTCGTAAG TTATTAGGcT TCCGCTGGTT TTTCTTCCGT 3840 

TTCCACTACT TCGACTAGGT TTCGGGTCAT AGGTCGCTJT CCCAAcTCCG TTAAAATATC 3900 

CGAACCGAAT TCTTCTAGTC CGATATTTTC TGCGATTTCA TCTAATGCTT CATCAATGTT 3960 

ATTAATAGTA ATTGCTTGTT TTTTTAAGTG AGATGTAGCT GCGATTAAAA CTTCGCCAAT 4020 

CACAACCGGA TTTCCACTTT CTAAACCTAC AGGCAACATT GATACACCTT GACCGATAGA 4080 

AGCTTGTTCA ACTTTTAAAC CTAATCGGTT ATCGATTTCT CTTAAAAATT TAAAACCAAA 4140 

ACTTAATTCT AATGACTTTC CGTTAATTTC TACATTCATA ACTTAAAATC TCCATTCATA 4200 

ATTAATTTAA ACAAAATAAA mArGCTTAAC GCCCTATTTT TATACCTCTC TTGGTGCAAC 4260 

CGGTGGTGAA TCTACTTTAG GTTGTGGAAT TGCTGTTAAA TCTTCGCCAG TTAATGCATC 4320 

TGCTTTTGTA GTGTCGTGGA ATCTGTATcC AGTCGCCTTA AGTTTCTTTG TTACAGCCTC 4380 
AGGTAGTGTT GCAAATCCAC GTTGGAAACG ACCATTCACT CCATATTCAT ATTCATATTC 4440 
ATCAATACCG TTAGCTTCTG CTTTTAATTC AAATTTATTG TGGAAACCTT GGAAATATTT 4500 
CGCTTTAAAT TTAGCGGAAT CCCCATTTTT GCCTGGTATT CTACTTTCAA CTTCCCAAGC 4560 
TTCATACAAT ACGCGATCTA CAACTGCATC TTCAATTTCA TCTGCAAAAT CGTCACCATA 4620 
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GTCCATTGTA TCCTCTGTAT CTGTATCAGC TTCATGTGAT AAGCCGTATT CAGTTAAAAA 4740 

AAGCATTTTA GTAGCATCTA CTTTTTCGCC AGCTTTTCTA AATAAAATAA TACGATCATT 4800 

ACTATTTTTC ATATTTGCCA TTCAATATTC CTCCGTTTTT TAAAATGTTT TGTAAGATAT 4860 

CGTTACTGAT GTGTGTAGCA ATTCTTGATT GGTAGTATCA TCAACTAACT GTGTGATGTT 4920 

AGTATCTTCT TCTTCAAAGT CATAATCGTT TGTTTTAACG CTAGGTGTTA AATCATCAAT 4980 

ACATCTTTTA ACAAGTCCGT CATGATGTCC TAAATCATCG CTTACACTCC AAATATCAAT 5040 

AACTAAATTC GTATCGCCAG AATAACTATC AAACGTGTAC TTACTTCTAT TTGACTCCGG 5100 

CATITTTATT ACAAAAAAAG GATACGGAAT CTCTTGTTGC ATCTCTTTAC GAGAAATAAC 5160 

AGGGAATCCA TATCCTTGTA GCGTTTCATA CGCTTTATTA TAAAGTTGTA AGTTCGGTGT 5220 

CATGCTTTTA TCTCCTATTC AAACAACGCT TTCAATTCTT CTACAGTTGA TTTCCTAATC 5280 

ACTTCGTATA CCGGCCACAT AAAAGGTTCA GCCTCCATGT ATCGAGTACC AAATTCTAAG 5340 

AAACCACTAT AAGCTGCGTG CGATGTGATA GTGTATTGCA AATCGCCAGT TTTTTTATAT 5400 

CTGATATTGC GTGATaAATT ACC 5423 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6251 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

AAACGCAGAT GTTCAATTAG AACCAGTCTA TCGTATTAAG GAAGGTATTA AACAAAAGCA 60 

AATACGAGAC CAAATTAGAC AAGCGTTAAA TGATGTGACA ATTCATGAAT GGTTAACTGA 120 

TGAACTAAGA GAAAAATATA AATTAGAGAC CTTGGACTTT ACTTTGAACA CATTACATCA 180 

TCCTAAAAGT AAAGAGGATT TATTACGTGC TCGTAGAACC TATGCATTTA CTGAACTGTT 240 

TTTATTCGAA TTACGTATGC AATGGCTAAA TAGATTAGAA AAGTCATCTG ACGAAGCAAT 300 

TGAAATTGAT TATGACATAG ACCAAGTTAA ATCATTTATT GATCGTTTAC CTTTTGAACT 360 

AACTGAAGCA CAGAAATCCA GTGTTAATGA AATTTTTAGA GATTTAAAAG CACCAATACG 420 

TATGCATCGA TTACTTCAAG GTGATGTAGG TTCAGGAAAA ACAGTAGTTG CTGCAATTTG 480 

TATGTATGCG TTAAAAACTG CTGGTTATCA ATCAGCATTG ATGGTACCAA CTGAAATTTT 540 

AGCAGAGCAA CATGCTGAAA GTTTAATGGC TTTATTTGGA GATTCTATGA ACGTTGCATT 600 
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TACGATTGAT TGTTTAATTG GAACCCATGC TTTGATTCAA GATGATGTGA TTTTCCATAA 720 

TGTTGGTTTA GTAATTACAG ATGAACAACA TCGATTTGGT GTGAATCAAC GCCAGCTTTT 780 

AAGAGAAAAA GGTGCAATGA CGAATGTGTT ATTTATGACA GCAACGCCGA TACCAAGAAC 840 

ACTAGCAATA TCAGTTTTTG GTGAGATGGA TGTGTCTTCA ATTAAACAAT TACCAAAAGG 900 

TCGTAAACCT ATCATTACTA CTTGGGCAAA GCATGAGCAA TACGATAAAG TTTTGATGCA 960 

AATGACCTCA GAGTTGAAAA AAGGTCGTCA AGCATATGTC ATTTGCCCGC TAATAGAAAG 1020 

TTCTGAGCAT CTCGAAGATG TTCAAAATGT TGTCGCATTG TACGAGTCTT TACAACAGTA 1080 

TTATGGTGTT TCCCGTGTAG GGTTATTGCA TGGTAAGTTA TCTGCCGATG AAAAAGATGA 1140 

GGTGATGCAA AAGTTTAGTA ATCATGAGAT AAATGTTTTA GTTTCTACTA CTGTTGTTGA 1200 

AGTAGGTGTT AATGTACCGA ATGCAACTTT TATGATGATT TATGATGCGG ATCGCTTTGG 1260 

ATTATCAACT TTACATCAGT TACGCGGTCG TGTAGGTAGA AGTGACCAGC AAAGTTACTG 1320 

TGTTTTAATT GCATCCCCTA AAACAGAAAC AGGAATTGAA AGAATGACAA TTATGACACA 1380 

AACAACGGAT GGATTTGAAT TGAGTGAACG AGACTTAGAA ATGCGTGGTC CTGGAGATTT 1440 

CTTTGGTGTT AAACAAAGTG GaTTGCCAGA TTTCTTAGTT GCCAATTTAG TTGAAGATTA 1500 

TCGTATGTTA GAAGTTGCTC GTGATGAAGC AGCTGAACTT ATTCAATCTG GCGTATTCTT 1560 

TGAAAATACG TATCAACATT TACGTCATTT TGTTGAAGAA AATTTATTAC ATCGTAGTTT 1620 

TGACTAATTG CCATGCTGAT TTGTCAATTT GAGTGCAACa CTTCGTTAAT TGAGTGATAT 1680 

GACACTTGAA CTATTTAAAT GTAAAGTGGT ATTTTAACAA TTTATAAATT TTCGACTAAA 1740 

TAATAGGTAA ATATTACAGT TATTTGTTGA GTCGGTTAAA TAGAAAGTGT TATGATATGT 1800 

GAGGAATGTT TAAGACTAGG TACTAAAAAA TGAGGGGTGA GACGTTGAAA CTAAAGAAAG 1860 

ATAWVCGTAG AGAAGCAATC AGACAACAAA TTGATAGCAA TCCCTTCATC ACAGACCATG 1920 

AACTAAGCGA CTTATTTCAA GTGAGTATAC AAACAATTCG TTtAGaTCGC ACTTATTTAA 1980 

ACATACCAGA ATTAAGGAAG CGTATTAAAT TAGTTGCTGA AAAGAATTAT GACCAAATAA 2040 

GTTCTATTGA AGAACAAGAA TTTATTGGTG ATTTGATTCA AGTCAATCCa AATGTTAAAG 2100 

CGCAATCAAT TTTAGATATT ACATCGGATT CTGTTTTTCA TAAAACTGGA ATTGCGCGTG 2160 

GTCATGTGCT GTTTGCTCAG GCAAATTCGT TATGTGTTGC GCTAATTAAG CAACCAACAG 2220 

TTTTAACTCA TGAGAGTAGC ATTCAATTTA TTGAAAAAGT AAAATTAAAT GATACGGTAA 2280 

GAGCAGAAGC ACGAGTTGTA AATCAAACTG CAAAACATTA TTACGTCGAA GTAAAGTCAT 2340 

ATGTTAAACA TACATTAGTT TTCAAAGGAA ATTTTAAAAT GTTTTATGAT AAGCGAGGAT 2400 
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TTAGAAGCCG TACAAAAGGC TGTTGAAGAC TTTAAAGATC TAGAAATTAT ACTTTTCGGT 2520 

GACGAAAAAA AGTATAATCT GAACCATGAA CGAATCGAAT TTAGACATTG TTCTGAAAAG 2580 

ATTGAAATGG AAGATGAGCC TGTTAGAGCG ATTAAACGTA AAAAAGATAG CTCAATGGTA 2640 

AAAATGGCTG AAGCTGTGAA ATCTGGTGAA GCAGATGGAT GTGTGTCAGC AGGTAATACT 2700 

GGTGCTTTAA TGTCAGCTGG TTTATTCATT GTTGGACGTA TTAAAGGTGT AGCTAGACCG 2760 

GCTTTAGTAG TAACATTGCC AACGATTGAT GGAAAAGGTT TTGTCTTTTT AGACGTTGGT 2820 

GCAAATGCTG ATGCTAAACC TGAACACTTA TTACAGTATG CGCAACTAGG GGATATTTAT 2880 

GCTCAAAAAA TTAGAGGTAT TGATAATCCG AAAATCTCAT TATTAAATAT AGGAACCGAG 2940 

CCAGCTAAAG GTAATAGTTT AACGAAAAAA TCATATGAGT TATTAAATCA TGATCATTCA 3000 

TTGAATTTTG TTGGGAATAT TGAAGCGAAG ACATTAATGG ATGGCGATAC AGATGTTGTA 3060 

GTTACCGATG GCTATACTGG GAACATGGTC CTTAAAAATT TAGAAGGTAC TGCAAAATCA 3120 

ATCGGTAAAA TGTTAAAAGA TACGATTATG AGTAGTACTA AAAATAAATT AGCAGGTGCA 3180 

ATATTGAAGA AAGATTTAGC TGAATTCGCT AAAAAGATGG ATTACTCAGA ATACGGTGGT 3240 

TCCGTATTAT TAGGATTGGA AGGTACTGTA GTTAAAGCAC ACGGTAGTTC AAATGCTAAA 3300 

GCTTTTTATT CTGCAATTAG ACAAGCGAAA ATCGCAGGAG AACAAAATAT TGTACAAACA 3360 

ATGAAAGAGA CTGTAGGTGA AtCAAATGaG TaAAACAGCA ATTATTTTTC CGGGACAAGG 3420 

TGCCCAAAAA GTTGGTATGG CGCAAGATTT GTTTAACAAC AATGATCAAG CAACTGAAAT 3480 

TTTAACTTCA GCAGCGAACA CATTAGACTT TGATATTTTA GAGACAATGT TTACTGATGA 3540 

AGAAGGTAAA TTGGGTGAAA CTGAAAACAC ACAACCAGCT TTaTTGaCGC aTAGTTCGGC 3600 

ATTATTAGCA GCGCTAAAAA ATTTGAATCC TGATTTTACT ATGGGGCATA GTTTAGGTGA 3660 

ATATPCAAGT TTAGTTGCAG CTGACGTATT ATCATTTGAA GATGCAGTTA AAATTGTTAG 3720 

AAAACGTGGT CAATTAATGG CGCAAGCATT TCCTACTGGT GTAGGAAGCA TGGCTGCAGT 3780 

ATTGGGATTA GATTTTGATA AAGTCGATGA AATTTGTAAG TCATTATCAT CTGATGACAA 3840 

AATAATTGAA CCAGCAAACA TTAATTGCCC AGGTCAAATT GTTGTTTCAG GTCACAAAGC 3900 

TTTAATTGAT GAGCTAGTAG AAAAAGGTAA ATCATTAGGT GCAAAACGTG TCATGCCTTT 3960 

AGCAGTATCT GGACCATTCC ATTCATCGCT AATGAAAGTG ATTGAAGAAG ATTTTTCAAG 4020 

TTACATTAAT CAATTTGAAT GGCGTGATGC TAAGTTTCCT GTAGTTCAAA ATGTAAATGC 4080 

GCAAGGTGAA ACTGACAAAG AAGTAATTAA ATCTAATATG GTCAAGCAAT TATATTCACC 4140 

AGTACAATTC ATTAACTCAA CAGAATGGCT AATAGACCAA GGTGTTGATC ATTTTATTGA 4200 
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AACATCAATT CAAACTTTAG AAGATGTGAA AGGATGGAAT GAAAATGACT AAGAGTGCTT 4 320 

TAGTAACAGG TGCATCAAGA GGAATTGGAC GTAGTATTGC GTTACAATTA GCAGAAGAAG 4380 

GATATAATGT AGCAGTAAAC TATGCAGGCA GCAAAGAGAA AGCTGAAGCA GTAGTCGAAG 4440 

AAATCAAAGC TAAAGGTGTT GACAGTTTTG CGATTCAAGC AAATGTTGCC GATGCTGATG 4500 

AAGTTAAAGC AATGATTAAA GAAGTAGTTA GCCAATTTGG TTCTTTAGAT GTTTTAGTAA 4560 

ATAATGCAGG TATTACTCGC GATAATTTAT TAATGCGTAT GAAAGAACAA GAGTGGGATG 4620 

ATGTTATTGA CACAAACTTA AAAGGTGTAT TTAACTGTAT CCAAAAAGCA ACACCACAAA 4680 

TGTTAAGACA ACGTAGTGGT GCTATCATCA ATTTATCAAG TGTTGTTGGA GCAGTAGGTA 4740 

ATCCGGGACA AGCAAACTAT GTTGCAACAA AAGCAGGTGT TATTGGTTTA ACTAAATCTG 4800 

CGGCGCGTGA ATTAGCATCT CGTGGTATCA CTGTAAATGC AGTTGCACCT GGTTTTATTG 4860 

TTTCTGATAT GACAGATGCT TTAAGTGATG AGCTTAAAGA ACAAATGTTG ACTCAAATTC 4920 

CGTTAGCACG TTTTGGTCAA GACACAGATA TTGCTAATAC AGTAGCGTTC TTAGCATCAG 4980 

ACAAAGCAAA ATATATTACA GGTCAAACAA TCCATGTAAA TGGTGGAATG TACATGTAAT 5040 

ATATTTGAGC TAAAGCTCAT TGACGCAGTG GTTGACTGGT CATCCAATGG AGAATTGTCT 5100 

GACCTAGTCA ACTTTGCGGG GGAAATTCTA AGCAACCTAG ATAAGGTTCC AGAATTTCTC 5160 

CCTAAGAAAC ACTAATCAAT aAATTGwTAA GTGTTTCTAA AATTTCTACT TGTTTTTTAG 5220 

AATTTAAAAT GGGAAAATAT AGTAGTCTAT GTATAGGCAT TTTTAAAGGA GGTGAATCGA 5280 

CGTGGAAAAT TTCGATAAAG TAAAAGATAT CATCGTTGAC CgTTTAGGTG TAGACGCTGA 5340 

TAAAGTAACT GAAGATGCAT CTTTCAAAGA TGATTTAGGC GCTGACTCAC TTGATATCGC 5400 

TGAATTAGTA ATGGAATTAG AAGACGAGTT TGGTACTGAA ATTCCTGATG AAGAnGCTGA 5460 

AAAAfiTCAAC ACTGTTGGTG ATGCTGTTAA ATTTATTAAC AGTCTTGAAA AATAATAAAT 5520 

CTTACATCTG GGTCGTCAGT ATTGTCGACT CAGTTTTTTT CTTTAATTAT CAATAGTTTT 5580 

AACGTAAAAT TAAAGATGAT TCAAGAGCAA CACATAAAGG AGATAAAATA ATGTCTAAAC 5640 

AAAAGAAAAG TGAGATAGTT AATCGTTTTA GAAAGCGCTT TGATACTAAA ATGACAGAGT 5700 

TAGGCTTTAC TTATCAAAAT ATTGATTTAT ACCAACAAGC ATTTTCGCAT TCGAGTTTTA 5760 

TTAATGATTT TAATATGAAT CGTTTAGACC ATAATGAGCG TTTAGAGTTT TTGGGTGATG 5820 

CGGTATTAGA ATTGACGGTT TCACGATATT TATTTGATAa ACATCCCAAC TTGCCAGAAG 5880 

GGAATTTAAC AAAAATGCGT GCCaCTATTG TATGTGAGCC CtCACTkGTA ATATTTGCGA 5940 

ATAAAATTGG ATTGAACGAA ATGATTTTAC TTGGTAAAGG TGAAGAGAAA ACAGGGGGAC 6000 
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20 




ATCAAGGACT AGATATAGTT TGGAAATTTG CTGAGAAAGT CATTTTCCCA CATGTAGAAC 6120 

AAAATGAGTT ATTAGGCGTG GTAGATTTTA AAACACAATT CCAAGAATAT GTGCACCAGC 6180 

AAAATAAAGG TGATGTAACC TATAATTTAA TAAAAGAAGA GGGACCGGCA CATCATCGTC 6240 

TATTCACTTC A 6251 
(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4920 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

ACCTACTGAA GTTGCTAATT TTTTGGAGCA ACTAAGCACT GAAATTGAAC GTCTTAAAGA 60 

AGATAAAAAA CAACTTGAAA AAGTAATCGA AGAGAGaGAT ACTAATATTA AGTCTTATCA 120 

AGACGTGgCA TCAATCTGTA AGTGaTGCTT TGATACAAGC TCAAAAAGCT GGTGAAGAAA 180 

CTAAGCAAGC TGCAGAGAAA CAAGCTGAAG CGATTATAGC TAAGGCAGAA GCGCAAgcTA 240 

ATcAAATGGT TGGTGACGCG GTAGAAAAAG CACGCCGTTT AGCATTCCAG ACTGAAGATA 300 

TGAAACGTCA ATCAAAAGTA TTTAGATCGC GTTTCCGTAT GTTAGTTGAA GCGCAATTAG 360 

ACTTATTAAA AAACGAAGAT TGGGATTACT TGTTGAATTA TGATTTAGAC GCTGAACAAG 420 

TGACGCTTGA AAATATTCAT CATTTGCATG AAAATGATTT AAAGCCAGAT GAAGTTGCAG 480 

CAAATGCACA AAATAATGCA TCAAATACAC CAGACAATAA TCAACAATCC AATGATTCAG 540 

AAACAACTAA GAAGTAAGAA TTAAATAAAG ACAGACGCGT AATATACATT TAACTTTTCA 600 

CAGCGAATTA GGTAATGGTG AGAGCCTAGT AAAAGCATGT ATGTTATATC ACTGGCTTTT 660 

TAATATTTAA ATAATGTAAT GAGAGAACTC TAAGTTGAGT TAATAAGGGT GGTACCGCGA 720 

GCAATCGTGC CTTTTAATTT AACTTAGAGT TTTTTAAATT TTTAAGGAGT GAAAAAAATG 780 

GATTACAAAG AAACGTTATT AATGCCTAAA ACAGATTTCC CAATGCGAGG TGGTTTACCA 840 

AACAAGGAAC CGCAAATTCA AGAAAAATGG GATGCAGAAG ATCAATACCA TAAAGCGTTA 900 

GAAAAAAATA AAGGTAACGA AACATTCATT TTACATGATG GCCCACCATA CGCGAATGGT 960 

AACTTACATA TGGGACATGC CTTGAACAAA ATTTTAAAAG ACTTTATTGT ACGTTATAAA 1020 

ACTATGCAAG GGTTCTATGC ACCATACGTA CCAGGTTGGG ATACACATGG TTTACCAATT 1080 

GAACAAGCAT TAACGAAAAA AGGTGTTGAC CGAAAGAAAA TGTCAACAGC TGAATTCCGT 1140 
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TTAGGTGTTC GTGGTGACTT TAATGATCCA TATATTACAT TAAAACCTGA ATACGAAGCT 126 0 

GCACAAATTC GTATTTTTGG AGAAATGGCA GATAAAGGTT TAATTTATAA AGGTAAAAAG 1320 

CCAGTTTATT GGTCTCCTTC AAGTGAGTCT TCATTAGCAG AAGCAGAAAT TGAATATCAC 1380 

GATAAACGTT CAGCATCAAT TTACGTTGCA TTTGACGTTA AAGATGACAA AGGTGTCGTT 1440 

GATGCAGATG CTAAATTTAT TATCTGGACA ACAACGCCAT GGACAATTCC ATCAAATGTT 1500 

GCGATTACCG TTCATCCTGA ATTAAAATAT GGTCAATACA ATGTAAATGG cGAAAAATAT 1560 

ATTATTGCAG AAGCCTTGTC TGACGCTGTA GCAGAAGCAC TGGaTTGGGA TAAAGCATCA 1620 

ATCAAATTAG AAAAAGAATA CACAGGTAAA GAATTAGAGT ATGTTGTAGC ACAACATCCA 1680 

TTCTTAGACA GAGAATCGTT AGTGATTAAT GGTGATCATG TTACTACAGA TGCTGGTACA 1740 

GgTTGTGTAC ATACAGCACC AGGTCACGGG GAAGATGACT ATATTGTTGG TCAAAAATAT 1800 

GAATTGCCAG TAATTAGTCC AATCGATGAT AAAGGTGTAT TTACTGAAGA AGGCGGCCAA 1860 

TTTGAAGGGA TGTTCTATGA TAAAGCTAAT AAAGCCGTTA CTGATTTATT AACAGAAAAA 1920 

GGTGCACTAT TAAAATTAGA CTTTATTACA CATAGCTATC CACACGACTG GAGAACAAAA 1980 

AAACCTGTAA TCTTCCGTGC TACACCACAA TGGTTTGCCT CAATCAGTAA AGTAAGACAA 2040 

GATATTTTAG ATGCAATCGA AAATACAAAC TTCAAAGTAA ATTGGGGTAA AACACGTATT 2100 

TACAATATGG TTCGTGACCG TGGCGAATGG GTTATTTCTC GTCAACGTGT GTGGGGTGTA 2160 

CCGTTACCAG TATTTTATGC TGAAAATGGC GAAATTATCA TGACGAAAGA AACAGTGAAT 2220 

CATGTTGCTG ATTTATTTGC AGAACACGGT TCAAATATTT GGTTTGAAAG AGAAGCGAAA 2280 

?1 <aCTrACTAC_CAGAAGGATT_TACACAT.CCA_G^ 2340 

ACAGACATTA TGGACGTTTG GTTTGATTCT GGTTCATCAC ACCGTGGCGT GTTGGAAACA 2400 

AGACCGGAAT TAAGTTTCCC AGCGGATATG TATTTAGAAG GTAGTGACCA ATATCGTGGT 2460 

TGGTTCAACT CTTCTATCAC AACTTCAGTT GCTACAAGAG GAGTATCACC TTATAAATTC 2520 

TTACTTTCTC ATGGTTTTGT TATGGACGGT GAAGGTAAGA AAATGAGTAA ATCTTTAGGT 2580 

AATGTGATTG TACCTGACCA AGTGGTTAAA CAAAAAGGTG CTGATATTGC GAGACTTTGG 2640 

GTAAGTAGTA CGGACTATTT AGCTGATGTT AGAATTTCTG ATGAAATTTT AAAACAAACA 2700 

TCTGATGTTT ATCGTAAAAT CAGAAATACA TTAAGATTTA TGTTAGGTAA CATTAACGAT 2760 

TTCAATCCTG ACACAGATAG CATTCCTGAA TCAGAGTTAT TAGAAGTGGA TCGTTACTTG 2820 

CTAAATCGTT TACGTGAATT TACTGCAAGT ACGATTAACA ACTATGAAAA CTTTGACTAC 2880 

TTAAATATTT ATCAAGAAGT TCAAAACTTT ATCAATGTTG AGTTAAGTAA TTTCTATTTG 2940 
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CAAACAGTGT TATATCAAAT TTTAGTTGAT ATGACGAAGT TGTTAGCACC AATCTTAGTG 3060 

CATACAGCTG AAGAAGTTTG GTCTCATACA CCACATGTTA AAGAAGAAAG TGTTCACTTA 3120 

GCAGACATGC CTAAAGTTGT AGAAGTAGAT CAAGCTTTAT TGGATAAATG GCGTACATTT 3180 

ATGAATTTAC GTGATGATGT GAACCGTGCA TTAGAAACTG CTCGTAATGA AAAAGTTATT 3240 

GGTAAATCAT TAGAAGCTAA AGTTACGATT GCTAGTAACG ATAAATTTAA TGCATCTGAA 3300 

TTCTTAACTT CATTTGATGC ATTACATCAA TTATTTATCG TGTCACAAGT TAAAGTTGTA 3360 

GATAAGTTAG ACGATCAGGC AACAGCTTAT GAACATGGTG ATATTGTCAT CGAACATGCA 3420 

GATGGTGAAA AATGTGAAAG ATGTTGGAAC TATTCAGAGG ATCTTGGTGC TGTTGATGAA 3480 

TTGACGCATC TATGTCCACG ATGCCAACAA GTTGTAAAAT CACTTGTATA ATTGAAATTG 3540 

TATAAAGTAC TCATACAGAT GATATAAATT AAAGCTCTCT TCATAATCAT GTTGTAGTTT 3600 

TTGTTGACAT GATGAAGAGA GTTTTTTTGT GAATAAAAAA ATGACCAAGT TACCGGTCAT 3660 

ATATGTAAAA AATGTGCGAT TTACTAAAAT AAAAATTATT CAGGAATGGT ACAAATTCTC 3720 

TGAGGCATAT AAATGCGTTA TAGTTGCTAT TCTCAATTAT GTTCGCGATA ATTTTAAGTA 3780 

AAAGTAAGCA CAGATATTGA ATTTGATAGG AGTTAATTGA ATGTATCATA ACAGTAACGC 3840 

AAACTTTGTC AATGGTATCA CTTTAAATGT GAGAGATAAG AATGAATTAA AGCCATTTTA 3900 

TGAGGACATA TTAGGATTAA ATATTATAAA TGAGACATTA ACATCGATAC AATATGAAGT 3960 

AGGTCAAAAT AATCATGTCA TTACACTTGT TGAATTACAA AATGGACGTG AACCTTTAAT 4020 

GTCCGAAGCG GGACTGTTTC ATATCGCAAT TAAACTACCT CAAATTAGTG ATTTAGCTAA 4080 

TTTACTAATT CATTTAAGCG AATATGATAT TCCAGTTAAC GGAGGTATAC AGCCTGCTTC 4140 

GTTATCATTA TTTTTTGAAG ACCCGGAAGG AAACGGTTTT AAATTTTATG TTGATAAAGA 4200 

CGAAGCGCAA TGGACGAGGC AAAATAATTT AGTAAAAATT GATATTAGAC CATTAAATGT 4260 

ACCGAGATTA GTGAGTCATG CAACAAAATT GTTATGGTTA GGTATTCCAG ATGACGCTAT 4320 

TATAGGTGCA TTGCATATTA AGACAATTCA TTTATCAGAG GTAAAAGAGT ACTACCTCGA 4380 

TTATTTTGGA TTAGAGCAAT CGGCATATAT GGATGATTAT TCAATATTTT TAGCATCGAA 4440 

TGGCTATTAT CAACATTTGG CCATGAATGA TTGGGTATCA GCAACGAAAC GTGTAGAAAA 4500 

TTTTGATACG TATGGATTAG CAATTGTTGA CTTTCATTAT CCTGAAACAA CACATTTAAA 4560 

TTTACAAGGT CCGGATGGTA TCTATTATCG CTTTAATCAT ATCGAAGTTG AAGATTAGTA 4620 

TATACTTTGA ATGGACGAAC CATATAATGA ATCGTTTTTA ATGATCTTTT TATACAAGTT 4680 

ATGAAGGAGG CTGGGACATT AAGTTCTTAG GCAATGTAAA AAGCTGATTT CTATTAATTA 4740 
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TTTTCCTTAT ATTAATTGCC ATTAATACAA AAC CTAGCTC TCGTTTAACT TTATTTATTC 4860 
CTCGAACTGA CATTCGnGTG AACTCAAAAT nGCCTACTTn CTTAAATTAC CAATATCTAT 4920 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

TGGATTGCCA TTACATGGAC AAGATTTAAC TGAATCAATT ACACCATATG AAGGTGGTAT 60 

CGCTTTTGCA AGTAAACCAT TAATTGATGC TGATTTTATT GGTAAATCTG TATTAAAAGA 120 

TCAAAAAGAA AATGGTGCAC CAAGAAGAAC AGTGGGATTA GAATTACTTG AAAAAGGAAT 180 

TGCAAGAACT GGTTATGAAG TTATGGATTT AGATGGAAAT ATTATTGGAG AAGTAACTTC 240 

AGGAACACAG TCTCCATCAT CAGGAAAATC AATTGCACTT GCAATGATAA AAAGAGATGA 300 

GTTTGAAATG GGTAGAGAGT TGCTTGTTCA AGTTCGTAAG CGTCAATTAA AAGCGAAAAT 360 

TGTTAAGAAA AATCAAATTG ATAAATAATT AAAAAGGGGT GTGCATTGTG AGTCATCGTT 420 

ATATACCTTT AACTGAAAAA GACAAGCAAG AAATGTTACA AACAATTGGT GCAAAATCTA 480 

TAGGAGAATT ATTCGGTGAT GTACCAAGTG ACATTTTATT AAATAGAGAT TTAAATATTG 540 

CTGAAGGCGA ACGGAGAACA ACGTTACTTA GAAGATTnAA TCGCATTGCA AGCAAGAGTA 600 

jrCACTAGAGG AACGCGTACA TCGTTT 626 

(2) INFORMATION FOR SEQ ID NO: 28: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1126 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

nGGAAGTGGT GTATATATTT GTAATGAGTG TATTGAATTA TGCTCAGAAA TCGTCGAAGA 60 

AGAATTAGCT CAAAACACTT CTGAAGCGAT GACAGAATTA CCTACTCCTA AAGAAATTAT 120 

GGATCATTTA AACGAATATG TTATTGGTCA AGAAAAAGCT AAAAAATCTT TAGCTGTAGC 180 

TGTTTATAAC CACTATAAGC GTATTCAACA ATTAGGACCA AAAGAAGATG ATGTTGAATT 240 
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AACCTTAGCC AAGACGTTGA ATGTAC CATT TGCAATTGCA GATGCGACAA GTTTAACTGA 36 0 

AGCTGGTTAT GTAGGCGATG ATGTTGAAAA TATCTTGTTG AGATTAATTC AAGCAGCTGA 420

CTTTGACATT GATAAAGCCG AAAAAGGTAT TATTTATGTA GATGAAATTG ATAAAATTGC 480

ACGTAAATCT GAAAACACAT CTATAACACG TGACGTTTCA GGTGAAGGTG TTCAACAAGC 540

ATTGCTTAAA ATCTTAGAAG GTACGACTGC AAGTGTTCCG CCACAAGGTG GACGCAAACA 600

TCCAAACCAA GAAATGATTC AAATTGATAC AACAAATATC TTATTTATTC TTGGTGGTGC 660

CTTTGATGGT ATTGAAGAAG TGATTAAGCG CCGTCTTGGT GAAAAAGTTA TTGGTTTCTC 720 

AAGCAATGAA GCTGATAAAT ATGACGAACA AGCATTATTA GCACAAATTC GCCCAGAAGA 780

TTTGCAAGCC TATGGTTTGA TTCCTGAATT TATCGGACGT GTGCCAATTG TAGCTAATTT 840

AGAAACATTA GATGTAACTG CGTTGAAAAA CATCTTAACG CAACCTAAAA ATGCACTTGT 900 

GAAACAATAT ACTAAAATGC TGGAATTAGA TGATGTGGAT TTAGAGTTCA CTGAAGAAGC 960 

TTTATCAGCA ATTAGTGAAA AAGCAATTGA AAGAAAAACA GGTGCGCGTG GTTTACGTTC 1020 

AATCATAGAA GAATCGTTAA TCGATATTAT GTTTGATGTG CCTTCTAACG AAAATGTAAC 1080 

GAaGGTAGTT ATTACAGCAC AAACmATTAA TGrAGaACTG AACCAG 1126
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4392 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

ATTGACTTCT TAGCAATnAA TaTGAGTGAA GAACGTACTG TTGAAGTACC AGTTCAATTA 60 

GTTGGTGAAG CAGTAGGCGC TAAAGAAGGC GGCGTAGTTG AACAACCATT ATTCAACTTA 120 

GAAGTAACTG CTACTCCAGA CAATATTCCA GAAGCAATCG AAGTAGACAT TACTGAATTA 180 

AACATTAACG ACAGCTTAAC TGTTGCTGAT GTTAAAGTAA CTGGCGACTT CAAAATCGAA 240

AACGATTCAG CTGAATCAGT AGTAACAGTA GTTGCTCCAA CTGAAGAACC AACTGAAGAA 300

GAAATCGAAG CTATGGAAGG CGAACAACAA ACTGAAGAAC CAGAAGTTGT TGGCGAAAGC 360

AAAGAAGACG AAGAAAAAAC TGAAGAGTAA TTTTAATCTG TTACATTAAA GTTTTTATAC 420

TTTGTTTAAC AAGCACTGTG CTTATTTTAA TATAAGCATG GTGCTTTTTG TGTTATTATA 480

AAGCTTAATT AAACTTTATT ACTTTGTACT AAAGTTTAAT TAATTTTAGT GAGTAAAAGA 540 
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CTTACTAAGC TAAAGAATAA TGATAATTGA TGGCAATGGC GGAAAATGGA TGTTGTCATT 660 

ATAATAATAA ATGAAACAAT TATGTTGGAG GTAAACACGC ATGAAATGTA TTGTAGGTCT 720 

AGGTAATATA GGTAAACGTT TTGAACTTAC AAGACATAAT ATCGGCTTTG AAGTCGTTGA 780 

TTATATTTTA GAGAAAAATA ATTTTTCATT AGATAAACAA AAGTTTAAAG GTGCATATAC 840

AATTGAACGA ATGAACGGCG ATAAAGTGTT ATTTATCGAA CCAATGACAA TGATGAATTT 900 

GTCAGGTGAA GCaGTTGCAC CGATTATGGA TTATTACAAT GTTAATCCAG AAGATTTAAT 960 

TGTCTTATAT GATGATTTAG ATTTAGAACA AGGACAAGTT CGCTTAAGAC AAAAAGGAAG 1020 

TGCGGGCGGT CACAATGGTA TGAAATCAAT TATTAAAATG CTTGGTACAG ACCAATTTAA 1080

ACGTATTCGT ATTGGTGTGG GAAGACCAAC GAATGGTATG ACGGTACCTG ATTATGTTTT 1140 

ACAACGCTTT TCAAATGATG AAATGGTAAC GATGGAAAAA GTTATCGAAC ACGCAGCACG 1200

CGCAATTGAA AAGTTTGTTG AAACATCACG ATTTGACCAT GTTATGAATG AATTTAATGG 1260

TGAAGTGAAA TAATGACAAT ATTGACAACG CTTATAAAAG AAGATAATCA TTTTCAAGAC 1320 

CTTAATCAGG TATTTGGACA AGCAAACACA CTAGTAACTG GTCTTTCCCC GTCAGCTAAA 1380 

GTGACGATGA TTGCTGAAAA ATATGCACAA AGTAATCAAC AGTTATTATT AATTACCAAT 1440

AATTTATACC AAGCAGATAA ATTAGAAACA GATTTACTTC AATTTATAGA TGCTGAAGAA 1500 

TTGTATAAGT ATCCTGTGCA AGATATTATG ACCGAAGAGT TTTCAACACA AAGCCCTCAA 1560

CTGATGAGTG AACGTATTAG AACTTTAACT GCGTTAGCTC AAGGTAAGAA AGGGTTATTT 1620 

ATCGTTCCTT TAAATGGTTT GAAAAAGTGG TTAACTCCTG TTGAAATGTG GCAAAATCAC 1680 

CAAATGAeAT TGCGTGTTGG TGAGGATATC GATGTGGAC 1740

AATATGGGGT ACAAACGGGA ATCCGTGGTA TCGCATATTG GTGAATTCTC ATTGCGAGGA 1800 

GGTATTATCG ATATCTTTCC GCTAATTGGG GAACcAATCA GAATTGAGCT ATTTGATACC 1860 

GAAATTGATT CTATTCGGGA TTTTGATGTT GAAACGCAGC GTTCCAAAGA TAATGTTGAA 1920 

GAAGTCGATA TCACAACTGC AAGTGATTAT ATCATTACTG AAGAAGTGAT CAGCCATCTT 1980 

AAAGAAGAGT TAAAAACTGC ATATGAAAAT ACAAGACCCA AAATAGATAA ATCAGTGCGC 2040 

AATGATTTGA AAGAAACGTA TGAAAGCTTT AAATTATTCG AAAGTACATA CTTTGATCAT 2100 

CAAATACTAC GTCGCTTAGT AGCGTTTATG TATGAAACAC CTTCGACAAT TATTGAGTAT 2160

TTCCAAAAAG ATGCAATCAT TGCAGTTGAT GAATTTAATC GTATTAAAGA AACTGAAGAA 2220

AGTTTAACAG TAGAGTCTGA TTCGTTTATT AGCAATATTA TTGAAAGTGG TAATGGATTT 2280

ATAGGACAAA GTTTTATAAA ATATGATGAT TTTGAAACAT TGATTGAAGG CTATCCTGTC 2340 
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TCATGTAAAC CTGTCCAACA ATTTTATGGG CAATATGACA TTATGCGTTC TGAATTTCAA 2460 

CGATATGTTA ATCAAAACTA TCATATCGTG GTTTTGGTCG AAACCGAAAC TAAAGTTGAA 2520 

CGTATGCAAG CGATGTTAAG TGAAAtGCAT ATTCCATCAA TAACAAAATT GCATCGCTCA 2580 

ATGTCATCGG GGCAAGCAGT GATTATTGAA GGCAGTTTAT CTGAAGGATT TGAACTACCT 2640

GATATGGGAT TAGTTGTCAT TACTGAGCGT GAgcTTTTTA AATCAAAACA GAAAAAGCAA 2700

CGAAAACGTA CGAAAGCTAT CTCAAATGCT GAAAAAATTA AGTCTTACCA AGATTTAAAT 2760 

GTGGGAGATT ATATTGTTCA TGTGCATCAT GGTGTTGGTA GATATTTAGG TGTTGAGACG 2820

CTCGAAGTGG GGCAAACGCA TCGTGATTAT ATTAAATTGC AATATAAAGG TACGGATCAA 2880

CTATTTGTTC CAGTAGATCA AATGGATCAA GTTCAAAAAT ATGTAGCTTC GGAAGATAAG 2940

ACGCCAAAAT TAAATAAACT CGGTGGCAGT GAATGGAAAA AAACAAAAGC TAAAGTTCAA 3000

CAAAGTGTTG AAGATATTGC TGAAGAGTTG ATTGATTTAT ATAAAGAAAG AGAAATGGCA 3060

GAAGGTTATC AATATGGGGA AGACACAGCT GAGCAAACAA CATTTGAATT AGATTTTCCA 3120 

TATGAACTTA CGCCTGACCA AGCTAAATCT ATCGATGAAA TTAAAGATGA CATGCAAAAA 3180 

TCGCGTCCAA TGGATCGCTT GCTATGTGGT GATGTTGGTT ATGGTAAAAC TGAAGTTGCA 3240

GTGAGAGCAG CATTCAAAGC TGTAATGGAA GGAAAGCAGG TTGCATTTTT AGTTCCTACA 3300

ACTATTTTAG CTCAGCAACA TTATGAGACG TTAATTGAGC GTATGCAAGA TTTTCCTGTT 3360

GAAATTCAAT TAATGAGTCG TTTTAGAACG CCTAAAGAGA TAAAACAAAC TAAGGAAGGA 3420

CTTAAAACTG GATTTGTTGA CATAGTTGTT GGTACACACA AATTACTTAG TAAAGATATA 3480

CAGTATAAAG ATTTAGGGCT GTTGATTGTA GATGAAGAAC AACGATTTGG TGTACGCCAT 3540

AAAGAGCGTA TTAAAACATT AAAACATAAT GTAGATGTAC TAACATTGAC TGCAACCCCA 3600 

ATAGCTAGAA CATTGCATAT GAGTATGCTA GGTGTGCGGG ATTTGTGAGT GATTGAAACG 3660

CCGCCAGAAA ATCGTTTCCC AGTTCAAACA TATGTATTAG AACAGAACAT GAGTTTTATC 3720 

AAAGAAGCTT TAGAAAGAGA ACTATCCCGT GATGGCCAAG TGTTTTATCT TTATAATAAA 3780 

GTGCAATCCA TTTATGaAAA ACGAGAACAA CTCCAGATGT TAATGCCAGA TGCTAACATT 3840

GCAGTTGCTC ATGGACAAAT GACAGAGCGC GATTTAGAAG AAACGATGTT AAGTTTTATC 3900

AATAATgAAT ATGATATTTT AGTAACGACG ACGATTATTG AAACAGGTGT CGATGTCCCA 3960

AATGCAAATA CTTTGATCAT TGAAGATGCA GATCGCTTTG GATTGAGTCA GTTGTATCAA 4020

TTAAGAGGTC GTGTTGGTCG TTCAAGTCGT ATTGGTTATG CATACTTCTT ACATCCAGCA 4080

AATAAGGTAC TAACTGAGAC TGCAGAAGAT CGATTACAAG CGATTAAAGA ATTTACGGAG 4140
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TTAGGTAAAC AACAG CACGG CTTTATTGAT ACAGTTGGAT TTGATTTGTA CAGTCAAATG 4260 

TTAGAAGAAG CTGTAAATGA AAAACGTGGT ATTAAGGAAC CAGAATCTGA GGTGCCAGAA 4320

GTCGAAGTTG ATTTAAACTT GGATGCATAT TTGCCAACAG AATATATTGC AAATGAACAA 4380

GCTAAAATTG AA 4392

(2) INFORMATION FOR SEQ ID NO: 30:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TTTCTTTTGA ATCTATATCG AGGTGGTTGG TAGGTTCATC TAAAATAAGT ACATTGTCAC 60

GTTGCAACAT AAGTAGTGCT AGTTGTAAAC GTGCTTTTTC ACCACCAGAT AAATCATTAA 120

TTATCTTTTT AACATCGTCT TGTACAAATA AGAAACGTCC AAGAACTGCT CGAATATCTT 180

TTTCATTCAT TAACGGATAT TGATCCCACA CATAATCTAA AATCGTTTTA CTAGATTTAA 240

ATTCTGCTTG CTTTTGATCA TAATAACCAA TTTGTAAATT TGCGCCGAAA GTAATATCGC 300

CATTAAGCGC TTTTTGTTGA TTAGCAATAG TTTTAATTAA GGTCGATTTT CCAATACCAT 360

TTGGCCCAAT GATTGCTATA TGATCGCCTT TAGAGACCTC TATACTCATA GGTTTGGTAA 420

TTGCAGTTTG ATAACCGATT TCTAAATTTT TTACATGCAT GACGTCATTA CCTGTATTCC 480

GGTCAAAGCC AAATTGAATA TTTGCACTTT TGGCATCTAA CATTGGTTTA TCAATGCGTT 540

CCATTTTTTC TAAAATCTTA CGTCTACTTT TTGCCATTCC ACTTGTTGAA GCACGGGTAA 600

TATTTTTCTC AACAAAAGTT TCTAATCGTT TTATTTCTGC TTGTTGACTT TCATATTCTT 660

GCATTCGTTT TTGATAATAT AAATCCCGTT GCTGTATAAA TTCCTCGTAA TTACCAACAT 720

AGCGTTTGA 729
(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13856 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
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TGATGTTTCG ATACATTTGT TGCACCTTGT GGATATACTT TAAAGGTTGT GTCGTATGTT 120 

TCCTTACTAT CTTTAGCTTC AGATTCCTGT GATTCAACCG TTTTATATTT TTCAAGTGCA 180 

TGTCCTTCAA TATCAACTCG TGGAATAATG CGATTCAACC ATGCTGGTAA ATACCACGAA 240 

CCTTTtCCAA ACAATTTCGt TAATGCAGGA ATTAACATCA TtCTGACTAC GAAGGCATCA 300

AAGAGTACAC CAAACGCTAA TGCCATACCC ATTGATTTAA TCATGACATC TTCTTGGAAT 360

ACAAACGCAA AGAAGACACT AAACATAATT AATGCAGCTG CTACAATAAC AGGACCGCTT 420 

TCTTTCAATC CTACTTTGAT AGAATAATCA TTATCCCCTG TTTTACTATm yyCTTCATGr 480 

ATTCGCGACA TAAGGAAGAC TTCATAATCC ATCGCTAATC CAAATAAGAT ACCTATAGTA 540

ATAACCGGTA AAAATGCTAG CATTGGTCCT GTCGTTTCAA TACCAAACAG ACCTTTCATA 600

AAACCATCTT GCATTACTAA TGTTGTAAAT CCTAATGTTG CCATTAATGA CAAGACGAAT 660

CCTAAAACTG CTTTTAATGG TATTAGAATT GAACGGAAGA CAATCATTAA TAAGAAAAAT 720

GCTAATACAA CAATGACTGA GGCAAATAAA GGTATCGCCT CATTTAACTT TTTAGACATA 780

TCAATATTAA TGACACTTTG TCCCGAAATC TCCGTTTTGA ACCCATATTT ATCTTGTGCA 840

TCTTTATGAT AATCTCGTAA ATCATGCACT AAATCATTTG TACTCTCTGC ATTAGGCCCT 900

TGCTTAGGTA TCACGACCAT CAAAGCGTAA TCATTATCTT TACTCATTTG TGGTGGCGTA 960

ACGATATCTA CATTTTTCTT ATCTTTAATA TCTTTATATA CAGACTGTAA ATCTTGTTGT 1020

AATCCTTGTG GATCATCCTT TTTATCTTTC ACATTTATCA ACATCGGTAT TTGGCCATTA 1080 

AATCCTTCAC CAAATTTATC CGAGATAATA TCGTAAGCTT TTTTCTGTGT AGAATCTGCT 1140

GGTTTAACAC CGTCATCTGG AATACCAAGT CGCATATGAC TAACTGGTAT TGCAGCTGCT 1200

ACTAATATGA TTAAACCTAG TAATACTGCC GCAAGTGCAT TTCCTGTAAT AAATTTAGAC 1260 

CATGGCGTAT CAATATCTTT TTTGAATTTA GACTGTAATT TATTCACTTT AATGCGTTtA 1320 

TGGAAAATGC TTATTAATGC AGGTAATAAA GTTAAAGCGC TAAGTACTGC AAAAACAACA 1380

CTAATTGCCG AAGCAAATCC CATTACCGCT AAGAAGTCAA TGCCTACTAA TGATAAACCA 1440

CATACTGCAA TTACAACTGT TACACCAGCA AAAACAACTG CACTACCTGC TGTTCCTATT 1500

GCAAGACCAA TGCCTTTAAT GTAATCTGTT TCAGTTTTCA TAACTTGTCG ATATCTGAAT 1560

AAAATAAATA ATGCATAATC GATACCAACT GCTAGTCCAA TCATTACGGC TAATGTCAGT 1620

GTGACATTTG GTATATCGAA TGCATAAGTT AACAAACTGA TAATACCTAC ACCAGAGGCT 1680

AGACCAATCA ATGCACTTAT AATTGGTAAT CCTGCAGCAA TGACTGAACC GAATGTGATT 1740

AACAGTACAA CAAATGCAAC AATAATACCA ACTAGTTCAG AATTACCGCC TACTTCTGTA 1800 
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AAATGACTTT TAACATTATC TCTAGAGCCA TCTTTTAAAG ATGTTTGACT AACGTCATAT 1920 

GTGATATCTG CAAATGCAGT TGTTTTATCT TTACTAATTT GCTTATTTTC ATAAGGATCT 1960

GATATTTTAT CAATGTGCTT GTCATCTTTT TTAATATCAT CTAACGTTTT CTTAATATCT 2040 

TTAGTAATGT TCGGTTGCAC AATACCATCA TCTTTAGTCG TCTTAAAGAC AACACGTATT 2100 

TGTGCCTTTT CACTATCTTG ATTAAAATGT TTTTCAATCT TTTTATTCGT ATCTAACGAC 2160 

TCTAATCCTG TCATTTTAAT ATCATTGTCA AATTTCGGTG CATTTGTAGC AAGTGGTATC 2220

AATATTGCAG CTACAATCAC TATCCATGCA ATGACCGCGG ACCATTTATG TTTTGCGATG 2280

AATGTCCCCA TCTTATATAA AAATTTTGCC AAAGTATATT GCCTCCTTTT AAAATCAACG 2340

TTATAGTTTA AATATACAGT GTAGATTATT GTTCGATTAT AGTATCTATC CCCGACCTCT 2400 

TAAAGAATCA ATTGGAAAAT TTTGTATATT AAACTACACA CAAAGGAGAA ATGTAGATGA 2460 

AAGAGACTGA TTTACGAGTT ATAAAGACAA AAAAAGCATT GTCGAGTAGC TTGCTACAAT 2520 

TGTTAGAACA GCAATTATTC CAAACGATTA CTGTCAATCA AATTTGCGAC AACGCACTCG 2580

TACACCGTAC AACATTTTAT AAACATTTTT ATGATAAATA TGATCTTCTA GAGTACTTGT 2640

TCAATCAATT GACTAAAGAC TACTTTGCTA GAGATATCAG TGACCGTCTT AATCATCCAT 2700

TCCAAACGAT GAGTGATACG ATTAATAATA AAGAGGATTT GAGAGAAATC GCAGAATTCC 2760

AAGAAGAAGA CGCTGAATTT AATAAAGTAT TAAAAAATGT CTGCATTAAA ATTATGCATA 2820

ACGATATCAA AAATAATAGA GACCGTATCG ATATTGACAG CGACATCCCA GATAATCTCA 2880

TATTTTATAT TTATGACTCG TTGATTGAAG GTTTTATACA TTGGATAAAA GATGAAAAAA 2940

TTGATTGGCC TGG 3000

AATAGTAGAT GAGAAACTCA TGAGCGTTAC CAACATTCAT AATAAAAACG ATAGTGkACA 3060 

CGTTAATGAA TTCGTGTACT ACTATCGTTT TTTATTTTTA TCGTGCTTAT CGCTATTAAA 3120

ACAACTGATA CACAACACAT AAACTATGAA GAAAAAAATA AATCCGCTAT CTAAATGACT 3180 

TTGACTCAGT TGTTTAAATG ACCAAATTGC TAATACAATT CCCATTATTA TTGAAATAAC 3240

GTATCTCACA TTCTTATACC TATAATCCTT TTCTAAAAAT ATGGTTGCTA TTACTTAATT 3300

TTTAAAGTTA TAAATAAAAA GAGCCAACCG CAATGGATGG CCCTTGTTCA TTATGAAGCA 3360

TTAGAACATT TCTGAAACAA CCTTTTGTTC TAAGAAGTGT AATAAGTAGT CTGGACTACC 3420

TGTTTTAGCG TCCGTACCTG ACATTTTGAA ACCACCAAAT GGATGGTATC CAACAACTGC 3480

TGAAGTACAG CCTCTGTTAA GGTATAAATT GCCTACATCA AATTCGTTTA CCGCTTTAAT 3540 
CCAATGCTCG CGATTATTTG TAATCACTGC ACCAGTTAAA CCGTAATCTG TATCATTTGC 3600 
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TTCTTCTTGC ATGATTCTAT CTTTAGATTT AAGTCCTGAA ATGATTGTTG GTTCTACAAA 3720

GTAACCTTTT GAATCATCAG TGCCGCCACC TTGTTCTAAT TTACCTTCTT CTTTACCAAT 3780

CTCAATATAA TTTTTAATCT TATCAAATTG TTTTTTATTA ATAACTGGGC CCATATACGT 3840

ATTGTCTACA GTATTGCCCA ACGTTAATTC TTTTGTTAAT TTGATTGATT TCTCTAATAC 3900

TTCGTCATAA ACGTCTTTAT GCACAATTGC ACGTGAACAT GCTGAACATT TTTGACCAGA 3960

AAAACCAAAT GCTGACGTTA CAATAGCTTC TGCTGCCATA TCTGTATCAA TATTTTCATC 4020

AACTACAATG GCATCTTTAC CACCCATTTC AGCGATAACA CGTTTCAAGA AGTTTTGACC 4080

TTCTTGAACA ACGGCACTAC GTTCATAAAT TCTAGTACCT GTCGCACGTG ATCCTGTAAA 4140

TGTAACGAAA TGCGTATCTT TATGATCAAC TAAGTAATCA CCAATTTCTT TCGGATCACC 4200

AGGAACAAAG TTAACTACGC CTTTTGGTAA TCCTGCTTCT TCTAAAATTT CCATTAATTT 4260

ATAAGCGATA TAAGGTGTAT CCTCAGCAGG TTTCAATAAC ACTGTATTAC CTGCCACAAC 4320

TGGTGCTAAA GTTGTACCAG CCATAATCGC AAACGGGAAG TTCCACGGCG GAATTGTAAC 4380

ACCTGTACCA ATTGATTTAT AGAAATATTT ATTGTGTTCA CCTTCACGAT CAAGTACTGG 4440

CTTACCTTGA GCCAAGTCCA TCATTGAACG TGCATAGTAT TCAATAAAAT CAATACCTTC 4500

AGCTGCATCA CCAACTGCTT CATCCCATGG CTTACCTGCT TCATAAACCA TAATTGCTGC 4560

AATTTCCGCT TTTCGACGAC GAATAATTGC CGAAACACGT AACATAAGCT CTGCACGATC 4620

ATTTGCTGAC CATGTTTTCC AAGATTTATA AGCTTCGTTT GCTGCTTTAA ACGCATCTTC 4680

AACATCTTGT TTTGTTGCCT TTGATGCATT TGCAATCACT TGTGATGTGT CTGCAGGATT 4740

GATTGATTTA ATTTTGTCAT CTTTGAAAAT CTTCTCTCCA TTAATCACTA ATGGTATGTC 4800

TTGACCTAAT TCTTTTTCCA CGTCTTTCAA TGCTTTCTTA AACATATCCA CATTTTCTTG 4860

GACTGAAAAA TCGTAACCAG GTTCATTTTT AAATTCTACT ACCATGTACA CTTACCCCCT 4920

ATAAATTTTG AAAGTGGTTT AACCCTTTGA TTTAATGATA TAACATCATT TAAACTCATT 4980

TTACTATGAT TAAGGTTAGT TTTGCAATCG CTTTCATTTT TATGTTTTAT CACTTATTCT 5040

CAAGTATTTT GAAATTGATT GGTTACTTTT TAAAATTTAT ATGGGTCGCA ACTGCTACTT 5100

TATCGTTTCG TCATTTAATG TTTCGGATGG TAGGTCATTA TCAATTTTAC GAACGACTTT 5160

ACAAGGGTTT CCAACCGCTA AGCTGTGTGG CGGAATATCT TTAGTGACAA CACTACCAGC 5220

ACCAATCACA CTGCCTTCTC CAATCGTCAC CCCTGGTAAC ACGGCTACAT GACCGCCAAA 5280

CCAAGTATTA CTGCCAATAT GAATGGGTCC GGCTTTTTCA AAACCTTCAT TTCTATGATG 5340

GAAATTAAGT GGATGTGTCG CTGTGTAGAA TCCACAATTA GGTCCTATAA AAACATTATC 5400




EP 0 786 519 A2 

TCCTAGTTTA ACGTTCCAAC CATAATCTGT ATCAAAAGGA ATCGAAATAC TTACATTGTC 5520 

TGTTGTTGTT TGAAATAATT GATCAATTAA TTCCTTTCTT TTATTTGTAG CACTCGGTCT 5580 

TGTATGATTT AATTCAAAGC AAATATCTTT CGCTCGTGCA CGTTCATTGA TTAAGTATTG 5640

ATCAAAGTTT GCATCGTACC ATTTTTCTGC TAACATTTTT TCTTTTTCAG TCATTACACC 5700

TTTCAACTCC TAATAACTTA TTTACTTGTT TAAAAGTTAA TCAAATAAAC CTTCGCCTAT 5760 

GCAACTAATA CGCTATAACA TTATGAAATC ATGACCTTAT CACCCTTATC TATACAATTC 5820 

TCGCATCAAA TACTGCTAAA GTAGTAGATA AATTCAATAC TACAGACGCA TTCATTTTTT 5880

AATCTATTAA CGTACAATGT GAGTAAGAGA AATATAAAGG AGTATGATAG CGATGAGAAT 5940

ATTAATTACA GGCACAGTTG CTATCTTAAT CATTCTAGGT TTGGTCAAAA CGATACAAGA 6000 

TTACGAAATG ACAAACGACA CGAGTCGTcA GTTGTCAGAC AACAAAGATG ATGATAAAGT 6060 

CATCCATCTT AATAATTTTA AAAATTTACA TGCGAAAGAA TTTAACCCAT CTGATTTCTT 6120

TTAAGTCACC TAAGAATTGC AAATCCAGAA GTCATTTAAG TTTTACCTTT CATTCATACA 6180 

TCCTTTAATA TTAATTACGA CTTCTTTTAT ATAGATGCTA AGTAGAGAGA TTGTTGTGCA 6240 

ATGTTTGCAC GGCAATCTCT CTTTTTCTTT TTAAAATTGG TAAAAGTAAA ACGCAACGAT 6300

TGACTTATAT ACCTATAGGG GGTACATTAG ACGTGTAACA ATGAATCACA GGGAGGCAAT 6360 

AATGTGGCTA ATACGAAAAA AACAACATTA GATATCACTG GTATGACTTG TGCCGCATGT 6420 

TCAAATCGTA TCGAAAAGAA ACTGAATAAA CTTGATGACG TTAATGCCCA AGTGAATTTA 6480

ACTACAGAGA AAGCAACTGT TGAGTATAAC CCTGATCAAC ATGATGTCCA AGAATTTATT 6540

AATACGATTC AACATTTAGG TTACGGTGTC GCTGTAGAAA CTGTCGAATT AGACATTACA 6600

GGTATGACTT GTGCTGCATG CTCAAGCCGT ATTGAAAAAG TGTTAAATAA AATGGACGGC 6660 

GTTCAAAATG CAACGGTCAA TTTAACAACA GAGCAAGCTA AAGTTGACTA TTATCCTGAA 6720 

GAAACAGATG CTGATAAACT TGTCACTCGC ATTCAAAAAT TAGGTTATGA CGCGTCTATT 6780 

AAAGATAACA ATAAAGATCA AACGTCACGC AAAGCTGAAG CGCTACAACA TAAATTGATT 6840

AAGCTTATCA TATCAGCAGT ATTATCTTTA CCACTATTAA TGTTAATGTT TGTACATCTT 6900

TTCAATATGC ATATACCAGC ACTATTTACG AATCCATGGT TCCAATTTAT TTTAGCTACA 6960 

CCTGTACAAT TTATTATTGG ATGGCAATTT TATGTAGGTG CTTATAAAAA CTTAAGAAAT 7020 

GGTGGCGCCA ATATGGATGT ACTTGTTGCT GTTGGTACAA GTGCAGCATA TTTTTACAGT 7080

ATTTATGAAA TGGTTCGTTG GCTAAATGGC TCAACAACGC AACCGCATTT ATACTTTGAA 7140

ACAAGCGCCG TACTAATTAC CTTAATCTTA TTCGGTAAGT ATTTAGAAGC TAGAGCGAAG 7200 
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TTAAAAGATG GTAATGAAGT GATGATTCCT CTAAATGAAG TACATGTTGG AGATACACTT 7320

ATCGTTAAAC CAGGTGAAAA GATACCTGTT GATGGCAAAA TTATTAAAGG TATGACTGCC 7380

ATCGACGAAT CTATGTTAAC AGGTGAATCT ATCCCTGTTG AGAAGAATGT TGATGATACT 7440

GTAATTGGTT CAACGATGAA CAAAAACGGT ACTATTACTA TGACAGCAAC AAAAGTTGGC 7500

GGGGACACTG CGTTGGCAAA TATTATTAAA GTTGTCGAAG AAGCTCAAAG TTCTAAAGCG 7560

CCGATTCAAC GATTGGCAGA TATTATTTCT GGTTATTTCG TTCCTATCGT TGTTGGTATC 7620

GCACTATTAA CATTTATCGT GTGGATTACT TTAGTTACAC CAGGTACATT TGAACCTGCA 7680

CTTGTTGCGA GTATTTCCGT TCTCGTCATT GCTTGTCCAT GCGCATTGGG ACTTGCTACA 7740

CCAACTTCTA TTATGGTAGG TACTGGTCGC GCTGCTGaAA ATGGTATTTT ATTTAAAGGT 7800

GGCGAGTTTG TTGAACGCAC ACATCAAATT GATACCATCG TTTTAGATAA GACGGGTACC 7860

ATTACAAATG GTCGTCCAGT CGTGACAGAT TATCATGGTG ACAATCAAAC GCTACAACTA 7920

CTTGCTACTG CTGAAAAAGA TTCTGAACAC CCATTGGCAG AAGCCATTGT CAATTATGCA 7980

AAAGAAAAGC AATTAATATT AACTGAGACA ACAACATTTA AAGCAGTACC TGGCCATGGT 8040

ATTGAAGCAA CGATTGATCA TCACCATATA TTGGTTGGTA ACCGTAAATT AATGGCTGAC 8100

AATGATATTA GCTTGCCTAA GCATATTTCT GATGATTTAA CACATTATGA ACGAGATGGT 8160

AAAACTGCTA TGCTCATTGC TGTTAATTAT TCATTAACTG GTATCATCGC AGTGGCAGAT 8220

ACTGTCAAAG ATCATGCCAA AGATGCTATA AAACAATTGC ATGATATGGG CATTGAAGTT 8280

GCCATGTTAA CTGGCGATAA TAAAAACACT GCTCAAGCCA TTGCAAAACA AGTAGGCATA 8340

GATACTGTTA TTGCAGATAT TTTACCAGAA GAAAAAGCTG CACAAATTGC GAAACTACAG 8400

CAACAAGGTA AGAAGGTTGC GATGGTTGGT GACGGTGTAA ATGATGCACC TGCATTAGTT 8460

AAAGCTGATA TCGGTATCGC CATTGGTACA GGTACAGAAG TTGCCATTGA AGCAGCTGAT 8520

ATTACTATTC TTGGTGGCGA CTTGATGCTT ATTCCTAAAG CCATTTATGC AAGTAAAGCA 8580

ACCATTGGTA ATATTCGTCA AAATCTATTT TGGGCATTCG GCTATAATAT tgccggtatc 8640

CCTATAGCTG CATTGGGCTT ACTTGCGCCA TGGGTTGCTG GTGCTGCAAT GGCACTAAGT 8700

TCAGTAAGTG TTGTCACAAA CGCACTTAGA TTGAAAAAGA TGCGATTAGA ACCACGCCGT 8760

AAAGATGCCT AGATTCCTTA ATAATGAAGG ATTCGTTGGT GATTCTGAGA TAGGCTAGTG 8820

ATTGGCTCTA TAATGTCGCG GTTTAyaGTC GGATCTTCGC TCCAACTGCA TATATAGTnA 8880

CACTTTTCGC TTGGCGAATT AGTGTATCTT ACCTAATAGc TCCGCCTATT AGGTTCCATC 8940

ATTATTATAA ATAATAAGTA CACTACGGtT TACAGTTGGA TCTTCGCTCC AACTGCATAA 9000
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GAAATTTTAA ATGTTGAAGG TATGAGCTGT GGTCACTGCA AAAGTGCTGT TGAATCTGCA 9120

TTAAATAATA TTGACGGTGT CACTTCAGCT GACGTTAACC TTGAAAATGG TCAAGTAAGT 9180

GTTCAATATG ATGACAGTAA AGTTGCTGTA TCTCAAATGA AAGACGCAAT TGAAGATCAA 9240

GGTTACGATG TCGTTTAATT AGGCAATATT CAACGTCATC AACACCAAAT TAAAAAATCG 9300

AACTGATGAG AATCCCAACA ATCCAAATTA TCTCATCAGT TCGATTTTTA ATTTACTCGT 9360

AACCTAGTAT CTCCAGTCTG CAATACATCT AATGTTGCAT CTAATGCATC GACAATTAGA 9420

TTTTTAACTG CAGCTTCAGT ATAAAACGCA ATATGTGGTG TTAATATGAC ATCTTCCCTG 9480

TCAATCAACG ATTCTAACAA TGGATCGTTC AGTGTTTTGC CCCTTTGATC ACTTGGGAAA 9540

AGTTTGCGTT CAAATTCATA CGTATCAAGT GCTGCACCTT TAATCACACC ATTGTCTAAT 9600

GCGTCTAATA ACGCCTTAGT ATCTACTAAA GAACCTCTCG CACAATTGAC AAATACTGCG 9660

cccrrriTTAA AATGTTTAAA TAATTCAGCA TTAAATAGAT AATGATTATA TTTCGTTGCA 9720

GGTACATGTA ATGTCACGAT ATCAGCACCT TCAACCGCTT CCTCAATCGT ATCTTTGTAA 9780

TCGACATACG TTGCAATTTT AGCATTAGGA AACGGtCGTA TGCGACCACA TCACTTTGAT 9840

AACCATTGGC AAATATATCG GCTACTACAC GGCCAATTCG ACCTGTACCA ATAACAGCTA 9900

CTTTTAAATC TTTAATGGAT TTCGATAAAA TAGTAGGTTC CCATCTAAAA TCATGcTCCC 9960

GCACTTTCGT TTGAATTTGA TTAAAATGAC GAACCACATT AATAGCCTGG TTCACAGCAA 10020

ACTCCGCAAT TGAATTCGGA GAGTATGACG GCACATTTGA CACAATAAAG TTATACTTGT 10080

TTGCTAACTC CAAATCATAT GTATCAAATC CAGCACTACG TTGTGCGATT TGTTTAATAC 10140

CTAGTTCATT TAATCGTTTA TAAACATGCT CTGATAATGG TATTTGTTGT GATAGCGATA 10200

AGCCATCATA ACCAGCGACA CCTTCAACAT TGTCATCAGT TAATGCTTCT TTAGTAATAT 10260

CTACCTCAAC ATGATGTTTC TCTGCCCACG CCTTGATATA AGGCATATCT TCATCACGTA 10320

CACTCATGAT TTTAATTTTT GTCATTTTAA CATCACCCTT AACTTTATTA TTCATATAAA 10380

TATGCTAGTT CTGTTAATCT TATTGCAGCT TCGTCTAATT TCTGGTCATC TAACGCCAAT 10440

GAAATTCTCA CATAACGATT ACCATTCTCT CCAAATGGTT TCCCTGGAGC AACAAGTATT 10500

GACTTCTCTT GCACTAAAAA TTGCTCAAAT TGCTCGCTGT CATAACCAGG CGGTGTTTCC 10560

AACCATACAT ATATGCCACC TTTAGCATGA ACAAATGGCA AATCAGCTTT TGCAAGCATG 10620

GCTTCGAATC GGTCACGACG TGTTTTAAAT ACATTGCTTT GTTCTTCTAA AAAATCATCA 10680

TAATGATTCA AAGCATATAT TGCGGCATCT TGTAATGCAC CAAACATCCC AGCATTTGTG 10740

TGCGTTTGGT AcrrriTCAA AGCTTGAATC ATATCTTTAT TACCAACTGC AAAACCGACT 10800
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CCATTTTCCG AAGCAAGTAT ACTAGGATTT TTAGCGTCGA AACCGAAAGC ACCATAAGCA 10920

AAATCATGCA CGATTTTAGT GTCTGTACCT TTAAATTTAG CTATCGCTTC ATCAAAAACT 10980

TCTTTCGTAG CTGTCGATCC AGTTGGATTA TTTGGATACG TTAAATAAAT GAGTTTTGTT 11040

TTATCTATTA TTTGTGAATC AACTTTGGAC CAATCTGGCA AATAATGTGG CGGTTCTAAA 11100

TTAAGCGGGA CTGGCTTGCC ATCAGCTAAA AGTACACCTG CTAAATAATC CGTGTAGCCT 11160

GGATCAGGTA GTAATACATA GTCTCCTGGA TTGATAACAC ATGTTGGTAC TGCCACTAAT 11220

CCATTTTTTG TACCATATAA AATGCATACT TCATCTTCTT TATCTAACGT CACATTATAT 11280

TGTCTTTGAT AAAAATCTAC AATAGCTTGC TTGAACGCTT CTTTACCATG AAAAGCACCA 11340

TATTTTTGAT TTTCAGGAAT AGTTAGTGCT TTTTGAAAAT GATCAATAAT ACCTTGTGGC 11400

GTGGGCCCAT CAGGGATTCC AACTGCCATA TTAATTAATG GCAATGGTCC ATGTTCGATT 11460

TTACGTCCCA TCGTTTTCCC GAAATAACTA TCAGGGATAT TTGCTAATTT GTTAGAGATC 11520

ATCAAATTCC TCCTCTATCA TTAAACATAG CCTGGGCGAC TATCATAATC CTAACAACTT 11580

GTATCACTCT CATTTAGATG GTTACAATGA CATCGCCATT CACCGTTATG TTCAACAGAA 11640

CTTATGACAC ACGTTGTATT GAATGAATTT ATTTTCATTT TAGGTAGGTA TAATATTATT 11700

GTCAATATTA GGAATTTTCA GATTAATATG CACTCAATCG TTATGATTTA ACTGTCATGC 11760

ATATCCGCAT GCGCAACCAG TTAGATATGC TTATATAAAG TATAACGCCC ATCAAGGTAC 11820

GTATTCAAAC GTGAACCTTA ACAGGCGTCA TTCATTGTTA AATAAAACTT CTTAAGCACA 11880

TACTTATTTC ACTATGCCTT TTACGTTCCC CTTATACTTT TCTCACATCT TTCTCTTAGA 11940

CTACTCCCTT ATACGCCCCG CTCAATATCT TTAATCATTT CATCTACAGT TATTTTCGCA 12000

CTCGTTAAGA CAATAGGAAC GCCTGCACCT GGATGCGTAC TTGCACCTGC AAAATATAAA 12060

TCTTTATAAT CTCGCGATAC ATTTTGTGGA CGATAATAAT TACTTTGCGC TAAAGTTGGC 12120

ATTAAACCGA ATGCCGAACC AAATTTCGCA TGATACGTTT GCTCAAAATC ATTTGGCGTA 12180

AAGATTGTTT CTGAAACAAT ATGCGATTTT ATATCTTCAA ATACTTCAAT CGTTGCTAAT 12240

TTACGATAAA TAATTTCCTT TATTTGTTGC GTCAAAGCTT CATCTGACCA ATCGATTCCG 12300

CTACCTGTTT TAAGTTCCGG CGTCGGCATT AGCACATAAA TACCAGTTTT GCCTTCTGGC 12360

GCAAGTGATT TATCAGCGAC CGCTGGTACA TACACATAAA TAGAAGGATC ATATGATAAA 12420

CGTCCCTCAA ATATTTCTTC AATATTGCCT CTAAAGTCAT CTGAAAAAAT AACATTATGA 12480

AGTCTCACTT GATCTGTCAC ATCAATATCT ATACCGATAT ACATTAAAAA TGCTGAACAA 12540

GAGTAATCTA AGTCTGCAAT TTTATGTGGT GGATACTTTT TAATAGGTGC AAAATCTGGC 12600
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ATGTCACCAT TCACTTTTAT CGCATCGGCC CGTTTGAATT TAGGATCAAT AATAATTTGC 12720 

TCAATTTCAG CATTTAGTTC AATATTAACG CCTAAGTCTT TATTTAATTG CGCTAGcCCT 12780 

TGAGCCATGC CATACATACC GCCTTTAATA AAATGCACAC CAAACATCAT TTCAATCATA 12840

GGAATAATTG AATATAGTGA CGGGCCTCGT TTTGGATCAA TTCCTATGTA TAACGTTTGA 12900

AACGCTAAAA GCTTTTGTAT CTTTTCGTTA TCAATATAAT GTTCAATTAG CTGATCTGCA 12960

TGATTTAACG TTTTTAACTT AGCACCTTGC ACAAGTGACG TCATATTATA AAAGTCACTC 13020

GGTTTGCGAT ACGTTCTTTC TAAGAAATAG CGACGTGCAA TTTCATATTT TTTATAAACA 13080 

TCCGTTAAAA AGGACATAAA ACCATGCGTT GAACCAGGTT CTATACTTTC TAGCATTTGC 13140

TGTAATTCAG CTAAATCTGT AGGCACCGTT ATACGATCAT CGTGGTCAAA ATACACATCG 13200 

TAAATATAAC GTAATTGTCT CAATTCAATA TAATCTTCAT AATTTTTACC ACACGCTGTA 13260 

AAAACATCTT TATAAACATC TGGCATCATG ACAATTGTGG GACCCATATC AAATGTAAAG 13320

CCGTCTTTCT TTAATTGATT CATACGCCCG CCTACATTAT TATTTTTTTC AAATATCGTC 13380

ACTTCATGAC CTTGAGAAGC AATACGGGCT GCCGCTGCTA ATCCTGTGAC ACCTGCACCA 13440

ATTACTGCAA TCTTCATTAT TCAACCACCT ATATTCTATG ATATTTACTA TTTATTTCAT 13500 

GAAACAACTT TGCCTTTTTC CTCTTATCCA CAAAAACACG TTCATGTAAT GTATAGTTAG 13560 

CCTGTCTCAC TTCGTCCAGT ATTTCAATAT ATATACGTGC TGCTAATTCT ATGATTGGTT 13620 

GTGCTTCAAT ACTAAATACT TTGATTTGAT CCATAACATC TTGAAAATCT TTTTCTGCGA 13680

TAGCTGCATA ATATTCCCAT AAGTCAATAT AATGATTATT AACACCATTT TGGTACACTT 13740

CAGCAATATC AAGTTGATAT TGGTTTAATG GTTGGTTAGT AAAATATATG GGTTGATTGT 13800

CAAAATCTTC ACCGACATCT CTTAATATAT TAAnGGGATC CTCTAGAGTC GACCTG 13856 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10088 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
ATATATAAAT ATAGATTAAG TATATAGATT AATCAACTTT TTTGGAAGAG CAAATCACGC 60
AATCAACAAA TAATATAAGA AGTTTTTGCG ATAGTTTTAA AATAGCTGTA ATAGAATACT 120 
AAATGTGACA AACTTAGAAC TAATATCAAG TGTTGATGTT TTGAATATAA AAATGCTAAT 180 
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ATAATTGGTT AATATATGAG TAATTAGAAA 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ATAGACAAAG GATGACGATT TATGTATATC 300

AATATGAAAG ATTATGGGTT AACAGGCATA AACAAAACTA AAGATACTCG AGCAATACAA 360

CGTGCGTTAA ATCGTGGAAG ATGTAAACCA ACGACAGTTT ATATACCGAA AGGGACGTAT 420

GATATTTGCA AACCATTAAC GATATATGGC AATACAACAC TTTTGTTAGA TAATGAAACT 480

ATTTTACGCC GATGTCATTC TGGTCCTTTA TTAAAAAATG GTCGTCGCTT TGGTTTTTaT 540

CGTGGTTATA ATGGACACAG TCATATTCAT ATTAAAGGCG GCAAGTTTGA TATGAATGGT 600

GTATCGTATC CTTATAACAA TACAGCTATG TGCATTGGGC ATGCTGAAGA TATTCAATTA 660

ATAGGTGTGA CCATTAAGAA TGTAGTGAGT GGTCATGCAA TTGATGCTTG TGGGATTAAC 720

GGACTCTATA TTAAAAGCTG TTCATTTGAA GGATTCATAG ACTATAGTGG CGAACcTTTT 780

ATTCTGAAGC AATACAATTA GACATTCAAG TACCTGGTGC TTTTCCAAAA TTCGGAACgA 840

CAGATGGTAC GATAACGAAA AATGTCATTA TCGAAGATTG TTATTTTGGA CCTTCAGAAT 900

TGCCCGAAAT GGGAAGTTGG AATCGTGCTA TTGGCTCACA TGCAAGTAGA CATAATCGAT 960

ACTATGAGAA TATTCATATT AGAAATAATA TATTTGAAGA TATACAAGGT TATGCATTAA 1020

CTCCCTTGaA GTATAAAGAT GCTTTCATTA TTAATAATAA GTTTATTAAC TGTGaGGGTG 1080

GCATTAGATA TTTAGGAGTT AGAGATGGTA AAAATGCAGC AGATGTGaTG ACAGGaAAAG 1140

ACTTAGGTTC CCAAGCAGGC ATAAATATGA ATATAATTGG AAATGAATTT AAAGGATCAA 1200

TGTCTAAAGA TGCGATACAT GTACGTAATT ATAATAATGT TAAACATAAA GATGTATTAA 1260

TCGTTGGGAA TACATTCAAT AATTCGACTC AATCAATTCA TTTAGAAGAT ATTGATACAG 1320

TGTTTTTAAG TCCTGTTGAA GCGGGTATTC AAGTTACTAC AATCAATGTA GATGAAATAA 1380

AAAAGTAAAA AGTTTCGCAT GACATTAGGA TTAAGAATAG TAGATAATTT TTGAAAGCGC 1440

ATTGATAAAA CGGTATAAAT ATGCTATAAT AAACCCAATT ATCTGATAAA AGGGGTATTT 1500

TGACGGTAAT GATAATACAA GATAGACAAC TTTCTATACT CTAATATAGT GAGTTGAAGT 1560

AGCTTGTCAT AATCATCATG AGGGGGAAAT TTATGGCTTA TTTCAATCAA CATCAATCAA 1620

TGATATCGAA AAGGTATTTA ACATTCTTTT CAAAATCAAA GAAAAAGAAA CCGTTTAGTG 1680

CGGGACAACT TATTGGACTA ATATTAGGTC CATTACTTTT CCTATTAACA TTATTATTCT 1740

TTCATCCACA AGACTTACCT TGGAAAGGCG TCTATGTTTT AGCGATTACT TTATGGATTG 1800

CGACTTGGTG GATTACTGAA GCAATTCCTA TTGCAGCAAC GAGCTTATTA CCAATTGTGT 1860

TATTACCATT AGGTCATATA CTTACACCAG AACAAGTATC ATCCGAATAT GGCAATGATA 1920

TTATCTTTTT GTTTTTAGGT GGATTTATTT TGGCAATTGC AATGGAAAGA TGGAATTTAG 1980
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TTGGATTCAT GGTGGCAACA GGATTCTTAT CTATGTTTGT ATCGAACACT GCAG CTGTAA 2100 

TGATTATGAT TCCGATTGGT TTAGCAATTA TTAAGGAAGC ACATGATTTA CAAGAAGCCA 2160

ATACGAATCA AACAAGTATT CAAAAGTTTG AAAAATCTCT AGTTTTAGCA ATTGGCTATG 2220 

CAGGTACGAT TGGTGGCTTG GGTACATTAA TCGGAACCCC GCCATTAATT ATTTTAAAAG 2280 

GACAATACAT GCAACATTTT GGACATGAAA TTAGTTTTGC TAAATGGATG ATTGTAGGGA 2340

TTCCAACGGT CATTGTTTTG TTAGGTATTA CTTGGCTCTA TTTAAGATAT GTTGCGTTTA 2400

GACATGATTT GAAATATTTa CCTGGTGGTC AGACGTTAAT TAAACAAAAG TTAGACGAGC 2460

TTGGCAAAAT GAAGTATGAA GAAAAGGTAG TACAAACTAT CTTTGTACTT GCTAGCTTAT 2520 

TATGGATTAC AAGAGAGTTT CTTCTGAAAA AATGGGAAGT TACGTCATCT GTTGCAGATG 2580 

GTACGATTGC TATTTTTATA TCAATATTAT TATTTATTAT TCCAGCTAAA AATACTGAAA 2640 

AACATCGCCG TATCATTGAC TGGGAAGTTG CAAAAGAGCT CCCTTGGGGT GTATTAATTT 2700 

TATTTGGTGG CGGTTTAGCA TTAGCGAAAG GTATTTCTGA AAGTGGTTTA GCAAAATGGT 2760 

TAGGCGAACA GTTGAAATCA TTAAATGGTG TTAGTCCGAT TCTTATTGTA ATTGTCATAA 2820 

CAATCTTTGT CTTATTTTTA ACTGAAGTGA CATCTAATAC TGCAACTGCA ACGATGATTT 2880

TACCGATTTT AGCAACGTTG TCTGTTGCTG TTGGAGTGCA TCCATTACTA CTTATGGCAC 2940

CTGCAGCTAT GGCGGCTAAC TGTGCATACA TGTTACGAGT AGGGACACCA CCGAATGCAA 3000

TTATCTTTGG TTCTGGTAAA ATATCTATCA AACAAATGGC ATCAGTAGGA TTCTGGGTAA 3060

ACTTAATCAG TGCAATAATT ATTATTTTAG TCGTGTATTA TGTAATGCCT ATAGTTTTAG 3120

GTATTGATAT AAATCAACCA CTGCCATTGA AATAGTAATT GCAGATTAGA ACGAAAAATA 3180

AAAGGTTACA TTAGCAATTG CTTGGACGAG TGGTAACGAA ACGTATACCG CAGCATCGTG 3240

TAASAACAAT ACAAACAAAA GAAAGTCAAC CAAGGATGGA TTCCTATTTT AATCCTTGGT 3300

TGACTCTTTA TTTTATTTAA ATTGTAGAAC CTAGAAAATA AAGTTTAATT AAAAGCACCA 3360 

ATCATTTCTA CTTTGAAATC TAAGGTTTCT AAAATAGCAA TGACTTTCTT TATATCGGTT 3420 

GTAATTGCAG AATCAGCCTG AACGAAAAAT CGATACATAC CTAATTGTGT TTTTAAAGGA 3480 

CGAGACTCAA TCCAGGATAA ATTAATATTA AACAAAGCAA ATGTATTAAG CACACTTGCT 3540 

AACAACCCAG GTTTATCATG CATTGGTGTA ATTAAAAACA TCAATGATGT CGCATTTTGA 3600 

TCAAATTGCT GCTGATTTTT TATAACTAAA AAACGTGTCA CGTTATGTGG ATAGTCTTCA 3660 

ATATGTGTAT CAATAGGTGT AAAACCATAA GctTCGCCAC TACCTAAAGG TGCAATTGCT 3720 

GCAACGCCAT TTTCAATTTT AGTCAAACTT TGAATTGTAC TGTCGACATA ATCATAGTCA 3780 
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TTTTTAATAT CAGAAATGGA ATCTGTTCCA TT AC CAT ATA ATGCAAAGTT AATATCTAAA 3900 

CGTATTTCAC CGTGTGCAAA GACATCTTGC TGTGCAAGTG CATCTGCCAC AATGTTGATT 3960

GTTCCTTCTA TAGAATTTTC AATAGGGACA ACACCAATCG ATGTGTCATC ATCTGCAACT 4020 

GCCTTGATGA CTTCAAATAA ATTTGACTTT GGTTGAAAAG TTGCTTCATT TTCAGAAAAA 4080 

TACTGACGAC AAGCCAAATA TGAAAATGTA CCTTTAGGGC CTAAATAATA T AATTG CAT A 4140 

TGCTACACCT CTACTAACTT AATGATGGAA AGGGCACTGG TTAGCATTTG ATTCTTTCTT 4200 

TTTATAGAAA AAGTTTGGAT CTTTTACTGT ATTGTCATAT CCGTGATGAT AATTTGACGT 4 26 0 

CAATGTTGGA GATAATGGCG GTGCTAGCCA AGACCATTTT CCGGTAACTT GACGACCTTG 4320

TTGTGCTTCG TTACGTTCGA ATAGTTCGAA TTGCTTTGCA GCGGTCAAAT GATCGACAAT 438 0 

TGATACGCCT TCTTTTTTAA AGGAATGATA CACAGCATAG TTCAATTCAA CAAGTGCTCG 444 0 

ATCTTTATTA AATGAATTAT TTTTAAGTGT ATCAAATTCA AACGCAT CTG CAACTTTTTC 4 500 

TAGTAAATTG TAACGGTAAT CATCAATAAA GTTACGTACG CCAATTTCAG TTACCATATA 4 56 0 

CCAACCGTTA AAGGGTGCAG TTGGATATAC AATGC CACCG ATTTTTAAGT C CAT ATTGG A 4 620 

AATGATAGGG ACTGCATACC ATTTTAAGTT CAATTTTCTT AATTTTGGAT AATGATTATG 4680 

TTCAATAGGT ACTTCTTTAA TTAATGAAGT AGGATATTCG TAAAATTTAA CTGACTCATT 474 0 

AGGTAATTGG TAAATCAGTG GTAACACGTC AAAATTAGTA CCTTTTCCTT TCCAACCTAA 4 800 

GTGATTTGCT AAGCGTGTAA CTTCTTTTTC AGCAGGATCA CCACAATTGT CATAGCCAGC 4 86 0 

ATAG CGAATT AATTGATTGT TGAAAATTTT AGGTCCATCC TTTGG AG CAT ATATAGTAAT 4920 

ATACGGCTTT AATTTACCTT CATTTGTAGC CTGTGTAATA TGATAAGTAA TTGATGATAA 4980

GAACGATGCT TCGTCAGTAA CATCTCTTGC ATCAATGACA TTTAACGAAT CCCAAAATAA 5040 

ACGACCAATG CAACGATTTG AATTACGCCA AGCCATTTTA GCACCATAAA TAAGTT CTTC 5100 

TTCTGTATGT GTATATGTCC CAGTTTCTTT TATTTCTAGT TCAATGTCAT GTAAACGTTT 5160 

ATTGATAATT TGCGTTTCAT AATGACACTC TTTATACATG TTTTCTATGA AAGCTTGAGC 5220 

CTCTTTAAAT AACATTAACA ACACCTCGCT TTATATTATA GTCTACATTA TTAAAATACT 5280 

CTTAAAAATT ATGTATATGT CATTAAATTG TTGGTTGATT TTAATTAAAA GTATGGAAAT 5340 

TAAGGGGCTC TTATGTATAT AAAAAAATGA ATTATGATAA AATGTAAGAA AATATTTAGG 5400 

TCGATTGGAG AGATACAAGT GTACCAATTA GAAGACGACA GTTTAATGTT ACATAATGAC 5460

TTATATCAAA TAAATATGGC TGAAAGTTAT TGGAATGATA ATATTCATGA AAAAATGGCT 5520 

GTATTTGATT TGTATTTTAG AAAAATGCCA TTTAATAGTG GCTATGCTGT TTTTAATGGT 5580 
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TTAAAGTCTA TTGGCTACAA GGATGATTTC TTATCATATT TAAAAGATTT AAAATTCACA 5700 

GGCAGCATCC GTTCGATGCA AGAAGGCGAA TTATGCTTTG GTAACGAACC ATTGTTACGC 5760 


GTAGAAGCAC CATTGATTCA AGCGCAATTA ATAGAAACAA TTTTATTAAA CATTGTAAAT 5820 

TTCCATACAT TAATTACAAC AAAGG CTAGC AGAATTCGTC AAATTGCATC AAATGATAAA 5880 

TTAATGGAGT TTGGTACACG TCGTGCGCAA GAAATTGATG CAGCATTGTG GGGCGCTAGA 5940

GCTGCTTACA TCGGGGG CTT TGATTCTACA AGTAATGTTA GGGCGGGGAA ATTATTTGGT 6000 

ATACCTGTGT CTGGTACACA TGCACATGCA TTTGTCCAAA CTTATGGAGA CGAATATGTT 6060 

GCCTTCAAAA AATATGCTGA AAGACATAAA AATTGTGTGT TCCTAGTAGA TACATTCCAT 6120

ACTTTAAAAT CTGGCGTGCC AAATG CAATA AAAGTTGCAA AAGAATTAGG TGACAAAATT 6180 

AACTTTGTAG GTATTCGATT AGATTCTGGA GATATCG CTT ATTTATCTAA AGAGGCAAGA 6240 


CGTATGCTTG ATGAAGCAGG ATTTACTGAA ACTAAAATTA TCGCGTCTAA TGATTTGGAT 63 00 

GAAGAAACGA TTACGAGTTT GAAAGCACAA GGTGCAAAAG TAGATTCTTG GGGCGTTGGT 6360 

ACAAAGCTGA TTACAGGATA CGATCAACCA GCATTAGGTG CAGTATATAA ACTTGTAGCT 64 20 


ATTGAAAATG AAGATGGTTC ATATAGTGAT CGTATTAAAT TATCAAATAA CGCTGAAAAG 64 80 

GTTACGACGC CAGGTAAGAA AAATGTATAT CGCATTATAA ACAAGAAAAC AGGTAAGGCA 6540 

GAAGGCGATT ATATTACTTT GGAAAATGAA AATCCATACG ATGAACAACC TTTAAAATTA 6600

TTCCATCCAG TGCATACTTA TAAAATGAAA TTTATAAAAT CTTTCGAAGC CATTGATTTG 6660 

CATCATAATA TTTATGAAAA TGGTAAATTA GTATATCAAA TGCCAACAGA AGATGAATCA 6720 

-35 GGTGAATA^— TAG^CTAGG-ATTACAATCT 6780' 

CCACAAGAAT ATCCAGTCGA TTTAAGCAAG GCATGTTGGG ATAATAAACA TAAACGTATT 6840 

TTTGAAGTTG CGGAACACGT TAAGGAGATG GAAGAAGATA ATGAGTAAAT TACAAGACGT 6900 


TATTGTACAA GAAATGAAAG TGAAAAAGCG TATCGATAGT GCTGAAGAAA TTATGGAATT 6960 

AAAGCAATTT ATAAAAAATT ATGTACAATC ACATTCATTT ATAAAATCTT TAGTGTTAGG 7020 

TATTTCAGGA GGACAGGATT CTACATTAGT TGGAAAACTA GTACAAATGT CTGTTAACGA 7080

ATTACGTGAA GAAGGCATTG ATTGTACGTT TATTGCAGTT AAATTACCTT ATGGAGTTCA 714 0 

AAAAGATGCT GATGAAGTTG AG CAAG CTTT GCGATTCATT GAACCAGATG AAATAGTAAC 7200 

AGTCAATATT AAGCCTGCAG TTGATCAAAG TGTGCAATCA TTAAAAGAAG CCGGTATTGT 7260

TCTTACAGAT TTCCAAAAAG GAAATGAAAA AGCGCGTGAA CGTATGAAAG TACAATTTTC 7320 

AATTGCTTCA AACCGACAAG GTATTGTAGT AGGAACAGAT CATTCAGCTG AAAATATAAC 7380 
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TAAACGACAA GGTCGTCAAT TATTAGCGTA TCTTGGTGCG CCAAAGGAAT TATATGAAAA 7500 

AACGCCAACT GCTGATTTAG AAGATGATAA ACCACAGCTT CCAGATGAAG ATGCATTAGG 7560 

TGTAACTTAT GAGGCGATTG ATAATTATTT AGAAGGTAAG CCAGTTACGC CAGAAGAACA 7620 

AAAAGTAATT GAAAATCATT ATATACGAAA TGCACACAAA CGTGAACTTG CATATACAAG 76 80 

ATACACGTGG CCAAAATCCT AATTTAATTT TTTCTTCTAA CGTGTGACTT AAATTAAATA 774 0 

TGAGTTAGAA TTAATAACAT TAAACCACAT TCAGCTAGAC TACTTCAGTG TATAAATTGA 7800 

AAGTGTATGA ACTAAAGTAA GTATGTTCAT TTGAGAATAA ATTTTTATTT ATGACAAATT 7 86 0 

CGCTATTTAT TTATGAGAGT TTTCGTACTA TATTATATTA ATATGCATTC ATTAAGGTTA 7920

GGTTGAAGCA GTTTGGTATT TAAAGTGTAA TTGAAAGAGA GTGGGGCGCC TTATGTCATT 798 0 

CGTAACAGAA AATCCATGGT TAATGGTACT AACTATATTT ATCATTAACG TTTGTTATGT 804 0 

AACGTTTTTA ACGATGCGAA CAATTTTAAC GTTGAAAGGT TATCGTTATA TTGCTGCATC 8100 

AGTTAGTTTT TTAGAAGTAT TAGTTTATAT CGTTGGTTTA GGTTTGGTTA TGTCTAATTT 8160 

AGACCATATT CAAAATATTA TTGCCTACGC ATTTGGTTTT TCAATAGGTA TCATTGTTGG 822 0 

TATGAAAATA GAAGAAAAAC TGGCATTAGG TTATACAGTT GTAAATGTAA CTTCAGCAGA 82 8 0 

ATATGAGTTA GATTTACCGA ATGAACTTCG AAATTTAGGA TATGGCGTTA CGCACTATGC 834 0 

TGCGTTTGGT AGAGATGGTA GTCGTATGGT GATGCAAATT TTAACACCAA GAAAATATGA 84 0 0 

ACGTAAATTG ATGGATACGA TAAAAAATTT AGATCCGAAA GCATTTATCA TTGCGTATGA 84 60 

ACCTCGAAAC ATACATGGTG GATTCTGGAC TAAAGGCATT CGTCGTAGAA AGCTTAAAGA 8 520 

TTATGAACCA GAAGAACTGG AAaGTGTAGT AGAaCATGAA aTTCmAAGTA AaTGAGAaTG 8580

AAmCAATtGC TGATTGTTTG TCACGAATGA AAtGCAAGGG TATATGCCGG TAAAACGTAT 864 0 

TGAAAAACCC GTGTTTCAAG AGCAAAAAGA TGGCACGGTT GAAGTATCAC ATCAAGAAAT 8700 

CGTTTTTGTA GGTAAGAAAA TCCAATAACA TAATCCAATT TAAATAAAGA CTATTTGAAG 8760 

AGGAAAGGCT ATTCAAAGTT TGAGTAATTT TACTTTGAAT AGCCTATTTG TTTATACATG 8 820 

CAAGATGCTC GATCCATATT GTATGAGAAA CCCCCAGCAA GCTATATAAA GCATATGCTG 888 0 

GGGGTTCTTA ATATTTTAAA AATTATTGTT AGATTATATA TATCGTCGCT TTTTCTAAAA 8940 

CAATCTCATC GCATGAAATT TTTTCTTCCT AG AG AC CTTT AATAAGATTA ATAGTTTACT 9000 

TAATCATATC TAGATAGTCT TATGACTTAT GCTTAATGAA AGTCATTCTA GGAGAAGTTC 9060 

CCAAAGCTTC TGTGTT CAT A ATTGTTAGTA GTATTTTATT ATCATTTGGT ATAAATATTT 9120 

CAATAACAAT TGAG CTATTA TTTTTATTAT ATAATGTGAG TTGTTTGTGT TCTGTATTTA 9180 
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CATTTAAATC TTGAGGATGC CATTCTCCCT CAATAATATT AAGATAATAC TTAGCCTCTG 9300 

AATTACATTT GAATTTATCA ATACTAAATA ATTCAATTTG TTCCATAATA TTATTTACCT 9360 

TTCTAAAATA CAAATTTTAA TAAC CATAAA TAGATGAATA CCATCGATAA TGGTCGCCAT 94 20 

TGGATACTGG AATAACATTG TTTTTAGCAT CTTGAGTCAT AAAACCATTA TCCCATGGAT 94 8 0 

TCCATATAAT TATAAC CTCT TGTCCATTAT CTAATTTAGC GTTCCCAACA ACTGCCATGG 954 0 

CATGCCCTGC GTGCATACCA TTTCTTGATT CTACTCTACT ACCTAAAACA GCAATTCCTT 960 0 

TATTATTTTT AGTAAGATTG TCAACTTCAT TATATGTAGT CATTCTATTA AGAAGTTGTG 966 0 

GACTTCTTCC CTGAGTTTGT CCAAAATAAA TCATCTCTCT TGGCGTTAAA CCAGTAAATT 9720 

GGAATCGTTG TCCTTGTAAG TTTGGGTGTA AAAATCTCAT CACAGCTTCT GCATGATATT 9780 

TGTT AG T ATT ATAAGTCGCA TTTAGTAATT CAGACATCGT ATAGCCTGCA CACCAACCAT 984 0 

TGTTACCTTG AGTTTCTCTT ATCTTGAAAT TCTCAAGTTT ATTTATATAT TGsTCGTTGT 9900

AAGTATAATT ATTACTTTTA AATTGACTAG TTGGCATAGT GACAGAAGCT TTTTGCTTTA 996 0 

GTTGCGTTAC ATTATTGCCA GTAGGTATAC TCTCAGTCTT TnTnAACTnT nTATCTTCTA 10020 

GACGTGGTGT TTTTAGTACT AGTTTAGCTT TATGATTTTG AGTACCACAT AGTAACCTTT 1008 0 

TGAGTTGT 10088 
(2) INFORMATION FOR SEQ ID NO: 33: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7563 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGGAAACGnA CCCnATGCGT ATGCTTGACG TGCCAAAATT AAATACGAAG TTCATAGCTT 60

TGAGGTACCA GAAGAACATT TATCTGGTCA AGAAGTCGCA GnACTCATAC AAGCAAATGT 120 

TAAAACAGTA TTTAAAACGC TTGTTCTAGA AAATACAAAA CATGAACATT TTGTATTTGT 180 


TATCCCAGTA AGTGAAACTT TAGATATGAA AAAGGCAGCT GCTTTGGTTG GAGAGAAGAA 24 0 

ATTGCAGCTT ATGCCTTTAG ATAATTTGAA AAATGTAACG GGATACATTC GTGGTGGGTG 300 

TTCGCCTGTT GGTATGAAAA CATTGTTTCC AACAGTCGTT GACAAATCGT GTGAAAATTA 360 


TAGTCATATC AGTGTGAGTG GTGGGCTTCG AACAATGCAA ATCACAATAG CTGTTGAGGA 420 

TTTGATTACA ATAACTAAAG GCAAAATTGG AGCAGTTATC CATGAATGAT TAATAACAAC 480 
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TGCCACACTC CTTTTTGATT GAATTAGCAT TTTACGATCA TAAACAGTCA TTATAATTGA 600

GTATTTGAAC ATAAAAATGT AATTTTATCG TAACAATTTG AGTGTTTGTG ATTGTTTTTG 660

GTAATTTATG ATTGAAAAGT GAAAGCGTAC TCATTATAAT ACAAAGTGAG ATGGGGTGAT 720

GATGATAATT ACTGaAAAAA GACACGAGTT AATATTAGAA GAACTTTCGC ACAAAGATTT 780

TTTGACTTTA CAAGAATTAA TAGATCGAAC TGGTTGCAGT GCTTCAACAA TACGArGAGA 840

TTTATCTAAA CTACAACAAT TAGGGAAATT GCAACGTGTG CATGGTGGTG CAATGTTAAA 900

AGAAAATCGT ATGGTTGAGG CGAATTTAAC TGAAAAATTA GCAACGAATC TTGATGAAAA 960

GAAAATGATT GCTAAAATAG CAGCTAATCA AATCAACGAT AATGAATGCT TATTTATCGA 1020

TGCTGGTTCA TCTACATTGG AGCTAATTAA ATATATTCAA GCGAAAGATA TCATTGTGGT 1080

AACCAATGGT TTAACACATG TAGAAGCTTT ACTTAAAAAA GGTATTAAAA CAATTATGCT 1140

AGGTGGTCAA GTTAAAGAAA ATACACTTGC TACGATTGGT TCTAGTGCTA TGGAGATATT 1200

AAGACGATAT TGTTTCGATA AAGCTTTTAT CGGGATGAAT GGATTAGATA TTGAACTTGG 1260

ATTAACTACT CCCGATGAGC AAGAGGCATT AGTTAAACAA ACAGCAATGT CATTAGCCAA 1320

TCAATCATTT GTACTTATAG ATCATTCTAA GTTTAATAAA GTATATTTTG CTCGTGTACC 1380

TTTGCTAGAA AGTACGACAA TCATCACATC TGAAAAAGCA TTAAATCAAG AATCGTTAAA 1440

AGAATACCAA CAAAAGTATC ACTTTATAGG AGGGACTTTA TGATTTATAC AGTGACTTTC 1500

AATCCTTCAA TTGACTATGT CATTTTTACG AATGATTTTA AAATTGATGG TTTGAACAGA 1560

GCAACAGCAA CATATAAATT CGCTGGGGGG AAAGGTATTA ATGTCTCGCG CGTCTTAAAG 1620

ACATTGGATG TTGAGTCAAC TGCCTTGGGA TTTGCAGGTG GATTTCCTGG GAAATTCATT 1680

ATAGATACAT TAAATAACAG TGCAATTCAA TCGAATTTTA TTGAAGTTGA TGAAGATACA 1740

CGTAJTAATG TGAAATTAAA AACAGGACAA GAAACAGAAA TCAATGCACC GGGTCCTCAT 1800

ATAACGTCAA CACAATTTGA ACAACTGTTA CAACAAATTA AAAATACAAC AAGCGAAGAT 1860

ATAGTTATTG TTGCTGGAAG TGTACCAAGT AGTATTCCAA GCGATGCGTA TGCGCAAATT 1920

GCACAAATTA CAGCACAGAC AGGTGCTAAA TTAGTAGTCG ACGCTGAAAA AGAATTGGCT 1980

GAAAgCGTTT TACCATATCA TCCACTATTT ATTAAACCTA ATAAAGATGA ATTAGAAGTG 2040

ATGTTTAATA CAACAGTGAA CTCAGACACA GATGTTATTA AATATGGTCG TTTGTTAGTT 2100

GATAAAGGTG CGCAATCTGT TATTGTCTCG CTTGGCGGTG ATGGTGCTAT TTATATTGAT 2160

AAAGAAATCA GTATTAAAGC AGTTAATCCA CAAGGGAAAG TGGTTAATAC AGTTGGCTCT 2220

GGTGATAGTA CAGTTGCAGG CATGGTGGCT GGAATTGCTT CAGGTTTAAC GATTGAAAAA 2280
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CGGGACGCTA TAGAAAAAAT AAAATCACAA GTTACGATTA GCGTACTTGA TGGGGAGTGA 2400 

AAATAATGAG AGTAACAGAG TTATTAACAA AAGATACAAT AGCAATGGAT TTAATGGCAA 24 60 

ATGACAAAAA TGGTGTTATT GATGAGTTAG TAAATCAATT AGACAAAGCA GGTAAATTAA 2520 

GTGATGTCGC GTCATTTAAG GAAGCGATTC ACAATCGAGA ATCACAAAGT ACAACTGGTA 2 580 

TCGGCGAAGG TATTGCCATT CCACATGCCA AAGTGGCCGC AGTTAAGTCA CCAGCTATTG 264 0 

CGTTTGGTAA ATCTAAAGCA GGCGTAGATT ATCAAAGTTT GGATATGCAA CCAGCACACT 270 0 

TATTCTTTAT GATTGcAGcG CCAGAAGGTG GCGCCCAAAC ACATCTAGAT GCTTTAGCTA 276 0 

AGTTGTCTGG TATTTTAATG GATGAAAATG TACGTGAGAA ATTATTACAT GCTTCATCAC 282 0 

CTGAAGAAGT ACTAGCGATC ATAGATGAGG CTGATGATGA AGTGACAAAA GAAGAAGAGG 288 0 

CAGAAGCTGA AGCACAACAA GTTGCAACTG CAGAACAATC ATCTAAACAA TCTAATGAGC 2940 

CATATGTGTT AGCAGTAACT GCTTGTCCAA CAGGTATTGC ACACACATAT ATGGCACGTG 3000

ATGCATTGAA AAAGCAAGCG GATAAAATGG GTATTAAAAT TAAAGTAGAA ACGAATGGTT 3060 

CAAGCGGCAT TAAAAACCAT TTAACTGAAC AAGATATTGA AAATGCAACA GGT AT CATTG 312 0 

TTGCTGCTGA TGTTCATGTT GAGACGGATC GCTTCGATGG TAAAAATGTC GTAGAAGTAC 3180 

CAGTAGCAGA TGGTATTAAA CGCCCAGAAG AATTAATTAA TAAAGCATTA GATACAAGTC 324 0 

GTAAACCTTT TGTTGCC CGT GATGGTCAAA GAAAAGGTAA CTCAAATGAC AGTCAAGAAA 3300 

AATTAAGCCC AGGTAAAGCA TTCTATAAAC ACTTAATGAA CGGTGTTTCT AACATGTTGC 336 0 

CACTTGTAAT ATCTGGTGGT ATTTTAATGG CAATTGTATT TTTATTTGGA GCAAATTCAT 3420 

TTAATCCAAA AAGCTCAGAG TACAATGCGT TTGCAGAGCA GCTTTGGAAC ATTGGTAGTA 34 80 






AAAGTGCATT CGCGTTAATC ATTCCAATTT TATCTGGATT CATTGCACGT AGTATTGCGG 354 0 

ATAAACCTGG TTTCGCTTCA GGTCTTGTAG GTGGTATGTT AGCAATTTCA GGTGGTTCAG 3600 

GATTTATTGG TGGTATTATT GCAGGTTTCT TAGCAGGTTA CTTAACACAA GGTGTTAAAG 3660

CCATGACACG TAAGTTACCA CAAGCATTAG AGGGATTAAA GCCAACATTA ATTTATCCAC 3720 

TATTAACAGT GACGGCTACA GGCTTATTGA TGATTTATGC CTTTAATCCA CCAGCATCTT 378 0 

GGTTAAATCA TTTGTTATTA GATGGATTAA ACAATTTATC AGGTTCTAAT ATTGTATTAT 384 0 

TAGGTTTAGT TATTGGCGCT ATGATGGCGA TTGATATGGG CGGTCCATTC AACAAAGCGG 3900 

CATATGTTTT TGCAACAGGT GCGTTGATTG AAGGTAATGC AGCACCAATT ACAGCTGCAA 3960 

TGATTGGTGG TATGATTCCA CCGTTAGCAA TTG CGACAGC GATGTTAATT TTTAGACGTA 4020 

AATTTACAAA AGAACAACGT GGTTCAATTA TCCCTAACTA TGTGATGGGT ATGTCATTTA 4 080 
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TGATTGGTTC AGGTATAGGT GGCGCAATTG CTTTAGGCTT AGGTTCACGA ATTACTGCGC 42 00 

CACATGGTGG TATTATTGTA ATTGTTGGTA CTGATGGTGC ACACTTACTT CAAACTCTTA 42 6 0 

TTGCACTTCT AGTTGGCACA TTAGTTTCAG CATTAATTTA CGGTTTAATC AAACCAAAGT 4320 

TAACTGAAAC AGAAATCGAA GCTTCAAAAT CAATGGACGA GTAGTTTTAA TGATGTAAAA 4380

TGATTG TT AG CAAAGAGCTT CATATTAAGT TGTATGTTCA ATGAATATAT GTTAGTTTTA 4440 

TATATCGTGT TAACGGTAGC TTATACAAAG CTGTAAAAAC ACTTTCTATT AATTCAGTTT 4500 

TTATGAATTG ATATGAAAGT GTTTTTATTT TTAGATAAAT GAATGAAGAA AT AG ACAC CA 4560 

CAAATGTATA GACTTTTTTA ATATTTTGCA AAAAG TTATG CCAAACGAAG CAGATATAGT 4 62 0 

AAAATATGAG TGTCTTAAAG TGAAAATTTA TAAATAAAGA AGGGTTTATA CGTGTCAGAA 4 68 0 

TTAATTATAT ATAACGGCAA AGTTTATACT GAAGATGGCA AAATCGATAA TGGTTACATT 4 74 0 

CATGTGAAAG ATGGACAGAT TGTTGCAATT GGAGAAGTGG ATGATAAAGC AGCAATTGAT 4800

AATGATACGA CAAATAAAAT TCAAGTGATT GATGCTAAAG GTCATCATGT ATTACCAGGT 4 86 0 

TTTATTGATA TACATATTCA TGGTGGTTAT GGTCAAGATG CAATGGATGG GTCATACGAT 4 92 0 

GGCTTAAAAT ATCTATCCGA AAATTTGTTG TCTGAAGGGA CGACATCATA CTTGGCCACT 4 980 

ACAATGACGC AATCGACTGA TAAAATAGAT AATGCACTTA CAAATATTGC TAAATATGAA 504 0 

GCGGAgCAAG ATGTTCACAA TGCAGCGGAA ATTGTAGGTA TACATTTAGA AGG AC CATTT 5100 

ATATCTGAAA ATAAAGTTGG TGCTCAACAT CCGCAATACG TTGTACGCCC ATTTATCGAT 516 0 

AAAATTAAAC ATTTTCAAGA GACTGCTAAC GGATTAATAA AGATTATGAC GTTTGCACCT 5220 

GAAATTGAAG GTGCAAAAGA AG CGCTTGAA ACGTATAAAG ATGACATTAT TTTTTCAATT 528 0 

GGTCATACAG TAGCAACATA CGAAGAAGCA GTTGAAGCTG TTGAGCGAGG AGCTAAACAT 534 0 

GTCACGCATT TATATAATGC AGCGACGCCA TTCCAACATA GAGAACCAGG TGTTTTTGGA 54 00 

GCAGCATGGT TGAATGATGC TCTACATACC GAAATGATTG TTGATGGCAC TCATTCTCAT 5460

CCGGCATCGG TTGCAATTGC TTACCGTATG AAAGGTAATG AACGTTTTTA TTT AATT AC C 5520 

GATGCAATGC GTGCAAAAGG TATGCCTGAA GGAGAATATG ATTTGGGTGG ACAAAAAGTA 5 580 

ACTGTTCAAT CGCAACAAGC ACGTCTTGCA AATGGTGCGC TTGCTGGTAG TATTTTAAAA 564 0 

ATGAATCATG GGTTACGTAA CTTAATATCA TTTACAGGTG ATACATTAGA TCATTTATGG 5700 

CGAGTAACAA GTTTAAATCA AGCCATTG CA TTAGGTATCG ATGATAGAAA AGGTAGTATT 5760 

AAAGTAAATA AGGATGCAGA TCTTGTTATT CTAGATGATG ATATGAATGT AAAATCTACA 5820 

ATAAAACAAG GCAAGGTTCA CACATTTAGC TAATAAATAA TCATAATTAA ATGTATGCAA 5880 
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TTTTCTGGGG GTGTCTAAAT GGGAAGGCGA TAACATGTAG TTGTAATTTA AGTCATAGTG 6000 

ATAAATTTGA ATGCGTGTTA CCCATGAGTG ACACATATAA CATGGAGGTG AATCCCTAGA 606 0 

AATAGGGAAT TAATTGGAAA CTTCGACCAT AATTAGTTTG ATTATATTTA TTCTATTAAT 6120

TGCATTAACC ACTGTATTTG TTGGTTCAGA ATTTGCATTA GTAAAAATTA GAGCAACAAG 6180 

AATTGAACAG CTAGCAGATG AAGGAAATAA ACCTGCTAAA ATAGTAAAAA AGATGATTGC 624 0 

TAATCTAGAT TATTATCTTT CTGCTTGTCA GTTAGGTATA ACAGTAACAT CTTTAGGGTT 6300 

AGGTTGGCTT GGTGAACCAA CGTTTGAAAA GCTATTACAC CCAATATTTG AAGCAATCAA 636 0 

TTTACCAACT GCATTAACGA CGACGATTTC GTTTGCAGTG TCATTTATAA TCGTTACGTA 6420 

TTTGCATGTA GTACTTGGTG AATTAGCGCC TAAATCTATA GCTATTCAAC ATACTGAAAA 64 80 

GCTTGCTTTA GTATATGCAA GACCATTGTT CTATTTCGGT AACATTATGA AACCATTGAT 654 0 

TTGGCTGATG AATGGTTCTG CACGTGTTAT TATTAGAATG TTTGGTGTAA ATCCTGATGC 6600

CCAAACTGAT GCAATGTCAG AAGAAGAAAT CAAAATTATT ATTAACAATA GTTATAATGG 6 66 0 

TGGAGAAATC AACCAAACTG AATTGGCATA TATGCAAAAT ATCTTTTCAT TCGATGAAAG 6720 

ACATGCAAAA GATATAATGG TACCTAGAAC TCAAATGATT ACACTAAATG AACCTTTTAA 6780 

TGTAGACGAA TTACTAGAAA CAATAAAAGA ACATCAATTT ACGCGTTATC CAATTACTGA 684 0 

TGATGGTGAT AAAGACCACA TTAAAGGATT TATTAACGTC AAAGAATTTT TAACTGAATA 6 900 

CGCTTCTGGA AAAACGATTA AAATAGCAAA CTATATaCAT GAGTTGCCAA TGATTTCAGA 696 0 

GACAACACGT ATCAGTGATG CATTAATTAG AATGCAACGT GAACATGTAC ATATGAGTCT 7 020 

TATTATAGAT GAATATGGTG GAACGGCAGG TATTTTAACG ATGGAAGATA TTTTAGAAGA 7080

AATCGTTGGA GAAATTCGTG ATGAATTTGA TGATGATGAA GTGAATGATA TCGTTAAAAT 7140 

TGAT5ATAAG ACATTCCAAG TAAATGGCAG AGTACTATTG GATGATTTAA CTGAAGAGTT 720 0 

CGGTATAGAA TTTGATGACT CTGAGGATAT TGATACGATA GGTGGATGGT TACAATCTCG 7260

TAATAC CAAT TTACAAAAAG ATGATTACGT GGATACAACT TATGATCGCT GGGTTGTTTC 7320 

AGAAATCGAT AACCACCAAA TTATTTGGGT GATATTAAAC TATGAATTTA ATGAAG CGAG 73 80 

ACCTACTATC GGACAGTCTG ATGAAGATGA AAAATCAGAA TAGATATTAA TATATAAACC 744 0 

AACTAAGAAT GATTTAATTC ATTTTTGGTT GGTTATTTTT TTGACTAAAA TTAAnGAAAA 7500 

GTGAAAATAG TATTGGAACT CAATATCTTT AATGATTTAA TGAATAAnTT TTATTGAAAG 7560 

CGA 7563 
(2) INFORMATION FOR SEQ ID NO: 34: 
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<A) LENGTH: 34 92 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

TT AT AT CAAC TTCATGGCGG AACCATTGAT GACCCATTAG ACGAAACAAT AAGCGCATTT 6 0 

SATGAATTGA AACAAGAAGG AATTATACGT GCTTACGGTA TTTCTTCTAT TCGCCCAAAT 120 

GTAATTGATT ATTATTTAAA ACATAGTCAA ATCGAAACGA TAATGTCTCA ATTCAATTTG 180 

ATTGATAATC GTCCAGAATC ATTATTAGAT GCAATTCACA ACAATGATGT TAAAGTATTG 240

G CAAGAGG AC CTGTGTCTAA AGGATTATTA ACTTCAAACA GTGTTAATGT GCTCGACAAT 3 00 

AAATTTAAAG ATGGTATTTT TGATTATTCT CATGATGAAT TGGGTGAAAC AATAGCCTCT 360 

ATTAAAGAAA TTGAAAGTAA TTTATCTGCA TTGACATTTA GTTATTTAAC ATCACATGAC 42 0 

GTG CTTGGTT CCATCATTGT AGGTGCAAGT AGCGTCGACC AATTAAAAGA AAATATTGAA 4 80 

AACTATCATA CTAAAGTTAG TTTAGATCAG ATTAAAACAG CAAGAGCTCG TGTAAAGGAT 54 0 

TTGGAATATA CCAATCATTT AGTGTAGAAG TCATTTTCAG TAATAAAAAC AG CAG CATGA 600 

GGCGTTTCAT TATAAAAATG CCTTACTGCT GTTGTTTATG TACAATTCGC TATAATTTAT 660 

GATTATGATT ACTCACTTAT GATAGAAATT AAAGCGTTGT CCTCACGCAT CAGTATTTAG 720 

TAATTTCGCC TTGCGGCATT GCCTTAAGCA AACTTCTGCC ACTTCATCTC TTAATAATTT 780 

TATTAAAACA T CTTTCT AT A TTTCACTTCG CATGTTGATT CATCATTATT AGTTATTATT 84 0 

TGTACACCCA GCACATTTCC TTGCAACACA AGTAGTTTGA ATTTTTCACA AGTATAATAT 900

AATGTACCGT CTGAAATTTG GTCTACAGAA ATATCGC CTA AAATATCCAG CACTGTAAAT 960 

TCTTCAAATA CTGATAGTTG TTCCGCATAT CGTACACAAA GTCTTACCAC ACTCTCCGAT 1020 

TGACAGTTCA TTGCCATCCC ACCTATTTAT GCTTTATTTT TAAATAATTT AGGGAAACAT 1080 

CGTTCAAAAA ATCTAGG CGC AATTTGATAC ATTTTCAACG CATGaTGCAT CCATTTAGGC 1140 

CGATTAATTT CCAATTGTTT TGTTTTAATG C CAT AAATG A TATCTTCTGC AAGCTGATTA 1200 

GCATCAAGCA TAATTTCCCC CATCTTTTTA gCATACTTCA TTGATGGGTC GGCTTTTTGA 1260 

TGAAAAGGTG TATCAATCGG GCCAACATTA ACTGTCATGA TATGTAAGTT TGGTGACTCT 1320 

AGTCTTAAAG CATTCATTAA TGCATAAAAC CCTGCTTTCG ATGCCCCATA ATGTGCAGCA 1380

TTTGCTTGTG TGGAAAATGC AGCTTGACTT GAAATAC CTA CAAT ATGTG C GTTAGATGTT 1440 

AAATATGGTC TCAACACAGT ATATAAAACA TTAAAACTAA TTAAATTAAG CTGATACGTT 1500 
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TAAATGAATC CATCGAATGA TGTATTGTCT TCAAATTGCA GTGCCTGTAT CGACTTCAAA 1620 

TCATTTAAGT CACAAGGAAT AACATTTATA GTTTTCCCCA ATTCCTGTTC AAAGATTCTA 1680 

GTTGCTTTAT CAACATCACG CACCAACAAC GTTACATGCA CTTTATTTTC TAGTAACTTT 1740 

CGGACAATCG ATAAACCTAA ACCACTCGTA CCACCAGTCA CTATAAAATG TTGTCCTTTC 1800 

ATCAATTAAC CTTCCTTTTC AATTATATAG AATGCAATTT ATCAACTTTA CATAATTGAG 1860 

ACAAGTTGAT TATCTTTCCT AATATATATA CAATAATAAG AAAATATAAC ATACAAATCA 1920 

AAAACTAAAG GGATGTGaCG TTAATG r AAC TCGTATTTTA TGGAGCTGGT AATATGGCAC 198 0 

AAGCTATATT TACAGGrATT ATTAACTCmA GCAACTTAGA TGCCAATGAT ATATATTTAA 2040

CAAATAAATC TAATGAACAA GCTTTAAAAG CATTCGCTGA AAAACTAGGT GTTAACTATA 2100 

GTTATGAtGA TGCGACATTA TTAAAAGATG CAGAyTATGT ATTTTTAGGT ACCAAACCAC 2160 

ATGACTTTGA TGCTCTAGCA ACACGCATCA AACCACATAT TACAAAAGwC AATTGCTTCA 2220 

TTTCAATTAT GGCAGGTATT CCGATTGATT ATATTAAACA ACAATTAGAA TGCCAAAATC 2 280 

CaGTTGCTAG AATTATGCCA AACACAAATG CGCAAGTTGG ACACTCTGTT ACTGG CATTA 2340 

GTTTTTCAAA CAACTTTGAC CCTAAATCTA AAGATGAAAT TAACGATTTA GTTAAAG CAT 2400 

TTGGTTCTGT AATTGAAGTA TCAGAAGATC ATTTACATCA AGTAACAGCT AT CACCGGAA 24 60 

GCGGCCCAGC ATTTTTATAT CATGTATTCG AGCAATATGT TAAAGCTGGT aCsAAACTTG 2520 

GTCTAGAAAA AGAACAAGTT GAAGAATCTA TACGCAACCT TATTATAGGT ACAAGTAAGA 2580 

TGATTGAACG TTCAGAtTTG AGCATGGCTC AATTAAGAAA AAATATTACC TCTAAAGGTG 264 0 

GTACGACACA AGCTGGCCTT GATACATTGT CACAATATGA TTTAGTATCT ATTTTCGAAG 2700 
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ATTGTCTAAA CGCTGCCGTC GACCGTAGTA TTGAACTTTC TAATATAGAA GACCAATAAA 2760 

AACAgACCCG CCAACACATG T ATG CAT CAT CGCAAGCACT GTGTTTGACG GGTTATTTTT 2820 

ATAATTTATT GTTATTTGGC AAGCATTGTT TATTACTTTG TCATTAGATT TTAAAACTAT 2880

CAAAATCTTT TACAAAATTA AAATTAGGTG TATCTTCATT TTGTATCAAT GTTTGATAAA 2940 

TTTCATTTAT ATCTTCTGTA TTATAGCGAT TGCTCAAATG TGTAATCAAC GTACGTTTAA 3000 

CATTGGCTTC TTTTATCAAT GCAAATACGT CTTCAATATG GCTATGATGA TAATTGTTGG 3 060 

CTAAATGCTT TTCACCATCT ATATAGGTCG CTTCATGTAC CATCACATCA GCATCTCTAG 3120 

AAATCACACG TTCATTAGAA CATGGTTTTG TATCACCAAA AATTGCTACA ACTGGACCCT 3180 

GTTTGGACTC ACCTCTAAAA TCTTTTGATT GATAAACTTG ACCATTATGT TCAAATGTAT 324 0 

CATGAGATTT TACTTCTTGA TATTTAGGAC CTGGTTCAAG ACCAATGTTT TTTAACGCTT 3300 
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CATGATTAAG TAAATGCGCC TCTACAGTAA AACCATCCAT GATGATATGT CAGATGATCA 3420 

TCGATTTCAA TATATGtAAT TGGATAGTTT AAATGTGACT CTGATAAATT CATAGACATT 34 80 

TCCACATATG CT 3492

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1973 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

ATCTAGCGGT ACAAGCGTCT TGGAGGCTAG TATGTTGAAC ATTGTAAACC CTGAAGATCA 60 

CTTCGTTGTC ATTGTTTCAG GTGCCTTTGG TAACCGATTT AAACAAATTG CACAAACTTA 120

TTACAAAAAT GTGCATATTT ATGACGTAAC ATGGGGAGAA GCTGTAGATG TCAAAGATTT 180 

CATCAATTTC CTTTCAACTT TAAATGTTGA AGTTAAAGCA GTATTTAGTC AATATTGCGA 24 0 

AACATCTACG ACAGTGCTAC ACCCTATTCA CGAGTTAGGA AATGCCATTA ATCAATTTAA 300 

TAGTAATATT TATTTTGTAG TTGACGGCGT AAGTtGCATT GGTGCTGTTG ATGTTGACAT 360 

TAACAAAGAT AAAATTGATG TACTTGTTTC TGGTAGTCAA AAAGCAATTA TGTTACCTCC 420 

AGGATTAGCT TTTGTAG CTT ATAGCCACCG TGCAAAAGAA CATTTCAAAG AAGTAACTAC 480 

GCCAAAATTT TATCTAGACT TAAATAAATA CATTTCGTCA CAAGCTGACA ATTCTACACC 54 0 

GTTCACACCA AATGTGTCTT TATTTAGAGG TGTAAATGCA TACGTTGAAA CCGTAAAAGC 600 

AGAAGGTTTC AATCACGTAA TAGCACGACA CTATGCAATT AGAAATGCAT TAAGAAGCGC 660 

CTTAAAAGCA TTAGATTTAA CTTTATTAGT CAATGATAAA GATGCATCTC CAACGGTTAC 720 

AGCATTCAAA CCTAATACAA ATGATGAAGT GAAAATAATC mAAGATGAAC TTAAAAATnG 780

CTTTAAAATA ACAATTGCnG GTGGTCAAGG CCATCTTAAA GGTCAAATTT TnAGAATTGG 840

TCATATGGGG AAAATTAGTC CTTTCGATAT TTTATCGGTA GTATCTGCTT TAGAAATTAT 900

TTTAACTGAA CACCGTAAAG TTAACTATAT CGGTAAAGGT ATATCAAAAT ATATGGAGGT 960

TATTCATGAA GCAATTTAAT GTACTCGTTG CAGATCCCAT ATCAAAAGAT GGTATCAAAG 1020 

CATTATTAGA TCACGAACAA TTCAATGTAG ATATTCAAAC TGGCTTGTCC GAAGAAGCAT 1080 


TAATCAAAAT TATACCTTCA TACCATGCTT TAATCGTTCG TAGTCAAACT ACGGTTACTG 1140 

AAAATATCAT AAATGCTGCT GATTCTTTAA AAGTAATCGC ACGCGCCGGT GTTGGTGTAG 1200 
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GTAATACGAT TTCAGCTACT GAACATACAC TGGCAATGTT ATTATCAATG GCACGAAATA 1320

TTCCGCAAGC ACACCAATCA CTTACAAATA AAGAATGGAA TCGAAATGCA TTTAAAGGTA 1380

CTGAGCTTTA TCATAAAACA TTAGGTGTCA TTGGTGCTGG TAGAATTGGT TTAGGTGTTG 1440

CTAAACGTGC GCAAAGTTTC GGAATGAAAA TACTAGCTTT TGACCCTTAC TTAACGGATG 1500

AAAAAGCAAA ATCTTTAAGC ATTACGAAGG CAACAGTTGA TGAGATTGCC CAACATTCTG 1560

ATTTCGTTAC ATTACATACA CCACTAACAC CTAAAACAAA AGGCTTAATT AATGCTGTCT 1620

TTTTTGCCAA AGCAAAACCT AGTTTGCAAA TAATCAATGT GGCACGTGGT GGTATTATTG 1680

ATGAAAAGGC GCTAATAAAA GCATTAGACG AAGGACAAAT TAGTCGGGCA GCTATCGATG 1740

TGTTTGAACA TGAACCTGCA ACTGACTCGC CTCTTGTTGC ACATGATAAA ATTATTGTTA 1800

CACCTCATTT GGGTGCTTCA ACAGTCGAAG CTCAAGAAAA AGTGGCAATT TCTGTTTCAA 1860

ATGAAATCAT CGAAATTTTA ATTGATGGTA CTGTAACGCA TGCAgTGAAT GCACCTAAAA 1920

TGGACTTAAG CAATATAGAT GATACTGTAA AATCATTCAT CAATTTAAGC CAA 1973



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7620 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
GGTGTTTCAG ATGTCACTGG TTGATTTTTA ATTGTAGACG GGTATTTTGG GCTTTCGCCA 60 



TATTTATTTG CCGGCTTACT GTCAAAGCAT AGGAATACTA TCATAACAAT TGTTAGGCCT 120 

AAAT5AACAA AATAAAGAAG TACTAACAAA ATATTAAGAC C CATCGGCAT TAATGTAAAA 180 

TCACTGTCAT AATAACTATC GATAATCTGT AATACTATAT AAAATATAAT ACTGAATACT 240

GTCATAATCA TTGGAAATAA CATTGTTCTT GATATATCGT GAAATCTTCG AACGCACAAC 3 00 

GCTAAATTTG GAATAAACGT TGCCAAACTA TAGACAAAAG TATACACAGA TGTAAGGATA 360 

ATCATCAATA TACTCATAAC TATTAATGTT TCGTTATCCG CCGCTATAGA AATAAAGAAT 420 

AGAAATAGGT TTATTATTAG CACACACACA GCTGGAACCA TAAGTATCAA ATGCCATAGT 480 

GCCATAT AC C AATATTCACT ACGTCTTGAT CTCCCCTTAA AATTTACATA ATTTTTCCAA 54 0 

AATAAAACGA ATG ATTT CAT AAAACCTACT TGAGGTAATT GTTC CATTGT AATCTCCCTT 600 

TCGTTAATCA TATTTATATT TTTAATTATT GTTACCGTTA TAATTTACAA GATTCATTAT 660 
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GTAAAATGAA AACCCGCTAC AAGTACACAT CTATATGGAG ACTCATTTGA AAGTCAACGC 7 80 

TTCGTTAACT ATACTAAAAA TATGTCATAC TGCAATGTTC ACGTTTAAAA GAGTCTCAAT 840 

CTATGCAAAT AAAATATTCC AT AACAAAG T ATATACTTTA CATTTTTATA ATTCTTAACA 900 

ATACTATTTT ATCAAACATT TACCACAATA AAAATATCTT TTTCATTTTT ATTTAAATTA 960 

ATCATATAAT TGCGAGGAGA ATATTATGGA TTTCGTTAAT AATGATACAA GACAAATTGC 102 0 

TAAAAACTTA TTAGGTGTCA AAGTGATTTA TCAGGATACC ACTCAAACGT ATACAGGCTA 10 80 

CATCGTGGAA ACGGAAG CTT ACTTAGGTTT GAATGATCGT GCGGCTCATG GCTATGG CGG 114 0 

TAAAATAACA CCTAAAGTCA CGTCATTATA TAAACGTGGT GGTACAATTT ATGCACATGT 1200

CATGCATACG CATTTACTCA TTAATTTTGT AACAAAATCT GAAGGTATAC CTGAAGGCGT 12 60 

ACTTATCCGC GCAATTGAAC CAGAAGAAGG TTTATCCGCT ATGTTC CGTA ACAGAGGTAA 13 2 0 

GAAAGGCTAC GAGGTAACGA ATGGCCCAGG AAAATGGACT AAGGCATTTA ACATTCCACG 1380

GGCTAT CGAT GGCGCTACGT TAAATGACTG TAGATTGTCT ATTGATACTA AGAATCGTAA 14 4 0 

ATATCCTAAA GATATTATTG CTAGTCCACG AATCGGTATT CCAAATAAAG GTGATTGGAC 1500 

ACATAAATCT TTACGTTACA CAGTGAAAGG TAATCCATTT GTGTCTCGCA TGCGTAAATC 156 0 

AGATTGTATG TTTCCCGAAG ATACTTGGAA ATAAATGCCA TCTTTCATTG ATTACTATCA 162 0 

TGAAAATGAA ATCTATCTCC TTATAAGTCA ATCAATCGTG CCGTCAACAT GCGGATGGGT 16 8 0 

TGATTGTTTT TCTTTGTATC CAT CAT ATTT TTTGATTCAT CTCCTCTTAT TGAACTTGTT 174 0 

CTTAATTATA AAATATAACA ATAGAATTAT TTATAATTAT TAAATTTAGA TGCATTAATA 1800 

TTATTGATAT TATTTTCAAA AACTAGAAAT ATTGATTTGT TGCATGTATA ATGTTAAAAG 186 0 

CGCCCTTTTA TAACGCTTAC ATATAAAAGC TTATTTAGGG AGAGGGATAT TCAACAAGGG 192 0 

GGATTTGAAA ATGATAGAAC TTAATGCAAT TACAACATTA TGTTTAG CTT GTATCCTTTA 1980 

TTTACTTGGT AAGGCTATCG TTAATCACGT TAATTTTTTA AAACGTATTT GTATACCAGC 2040

ACCAGTGATT GGCGGCTTAA TCTTTGCTAT TTTAGTTGCG GCTTTGGATT CATTTGGCAT 2100

GGTTAAGATT AAATTAGATG CTTCATT CAT TCAAGATTTC TTCATGTTAG CATTCTTTAC 2160 

GACAATCGGT CTTGGTGCAT CATTGAAATT ATTTAAATTA GGTGGCAAAG TCTTGCTATT 2220 

ATACTTTATG TTTTGTGCTA TCATTTCAGT CATTCAAAAC ATAGTTGGTG TATCACTAGC 2280 

AAAAGTATTA AATATTAAAC CTTTGTTAGG ATTAACAGCA GGTTCCATGT CTATGGAAGG 234 0 

CGGTCATGGT AATGCTGCTG CTTATGGTAA GACAATTCAA GATTTAGGTA TTGATTCGGC 2400 

ACTGACAG CG GCTCTTGCAG CTGCAACTTT AGGTCTTGTA TTTGGAGGGC TTATCGGTGG 2460 
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ATTTAAAGAT TATAGCCAAG TAGCATATAA CGAACATTTA CATAGTAAAT TTAATGCCAC 2580

TGAAGTATTC TTCATTCAAT TTACAATCGT TGTATTCTGT ATGGCAGTTG GAAGTTATTT 2640

CAGTCATTTG TTTACAGCTC AAACAGGGAT TAATGTTCCA ATTTACGTTG GCTCATTATT 2700

TGTAGCTGTT ATTGTCCGAA ATATCTCTGA AAGTTTTAAT TTTAATATTG TAGATTTAAA 2760

AATTACTAAT CAAATTGGCG ATGTCGCATT AGGTATTTTC TTATCTCTTG CGCTAATGAG 2820

CATTCAATTA ATCGAAATTT ATAAACTTGC TATACCTCTT ATTATTATCG TTTTAGTTCA 2880

AGTTGTCGTT ATGATTTTAT TTGCTGTTTT AATTTTATTT AGAGGTTTAG GAAAAGATTA 2940

TGATGCTGCA GTAATGGTAG GTGGTTTTAT CGGTCATGGG CTTGGTGCAc GCCAAATGCC 3000

ATGGCAAATT TAGATGTTAT TACTAAAAAA TATGGAAACT CACCTAAAGC ATATTTAGTT 3060

GTACCTATTG TTGGTGCATT CTTAATCGAT TTAATTGGTG TTATAGTCAT TATGGGATTC 3120

ATACAATGGT TTAGTTAAAC ACCAAACTCA TAAATAAAAG AGGAGGCCTT CGCCTCcTcT 3180

TTTATTTATC CTCGATGTAT ATTCAAGTTA CGTTGTTCTA TCCATGACAA TATTTCCGGA 3240

CTAAATACGA TTTGTTTTTG TGTTAAGTCG TCAATATTTT TAGCATCTAA CATCGTCATT 3300

ATTGATTTCA TGTGTTCAAT AAATGATTCT ACATAAGCTA CTGTATGTGC AATGCCATTA 3360

TTTTCAACTT GATTTAAAAA CGGACGTGAC ATACCAGTTG CCTTTGCACC AAGTGCTAAA 3420

CTTTTAATTG CATCGAGTGG TGTACGTAAA CCACCACTCG CGAAAACTGA AATTTCGCTT 3480

TGATAAGCCG TTGTTTCAAG TAATGACTCA ACTGTAGACT GTCCCCATGA TGATAAGTAA 3540

TCCATATCTT TATTTGCACG ACGTTCATTT TCAATATCTA CAAAGTTAGT ACCACCTTTG 3600

CCACTAACAT CGACATACTT GACGCCTATT TGTTGTAAGT CATGCATTAA TTCTTTGCTC 3660

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ATACCAAATC CAACTTCTTT TATAATGACT GGAACAGACA CTCGTGATAC AATCGACGCT 3720

ATA*CTATCTA ACCAAGTCAC AAATTCACGA TTCCCTTCAG GCATAACTAA TTCTTGAGGA 3780

GAATTAACAT GGATTTGTAA CGCTTGTGCC TCAAGTAATT CAACTGCTTC CAAAGCCTTT 3840

TCTACTGGTA CGTCCGCACC AACATTGCTA AAAATCATGC CTTCAGGATT CATTTTTCGC 3900

GCAATCGTAA ACGTCTCAGC CATGCGTGGA TTTCTCAATG CCGCATGTGT TGATCCAACT 3960

GCCATCGCTA AGCCAGTTTC TCTTGCAACT ACAGCTAGCT TTTCATTGAT GTTTTTCGTC 4020

CACTCGCTAC CACCCGTCAT TGCATTAATA TAAACCGGAT ATGCCATCGT TAAGTCAGGC 4080

GTCTGTGATG TCAAATCGAT ATCATTTACA TTAATTGATG GGATAGAATG ATGCACAAAA 4140

CGCATCTTAT CAAAATCTGA ATGCATTGCG TCAGATTGGG CCATTGCTAT TTCAACATGT 4200

TCATm-lTC TCTGTTCTCT TTGAAAATCA CTCATGATTA AACCTACCTT TTCGTCATTT 4260
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ATTACAGCTA AGCAAATATA ATATCCATAA TGTAAATGTA ATGCCGGCAT ATTTACAAAG 4 380 

TTCATACCAT AAATCCCAGC TATGAATGTT AACGGTGAAA ATATAACTGA TACTAATGTC 444 0 


AGTACTTGCA TAATACTATT CATTCTAAAT GACGTGTATG ACTCAAAATT TTCT CGT ATT 4 500 

TCGTTTGTCA TTTCTTGAGC AGTACGAATG ATATTACGTT GCTTAATCAA GTGGTCATCG 4 56 0 

ATATGTTGAA TGTATAGCGA ATGTTTATTA TCTATAATCA AATCACCATT TTGTTTCATT 4620

GTATCAATTA G CTCTTG CAT AGGAAACAGT ACACGTTTTA CTTTAATCAA ATCCGAACGT 4 68 0 

AACTTAAAGA CACTATCCAT GACCATTTTA TTAAAGCGAT CATCTACATG GCGGTCTTCA 4 74 0 

AAATGATAAA CACTATCTTC AAGTGCATAT ACAAAGTTGA AATATTTATC AACCATCATA 4800

TCTAAAATTA ATATGACGAC ATCTG CACAA TCTAATTCTG CATCTAATGT ATTCATATAC 4 860 

TTATAGACTA CTTTATTTAA TG ATT C CAAC GTTTGATGAT GATATGTTAC TAATACATTG 4 920 


TCTTGTATAA AAATATTTAG TGCTATTGGT GAATAGTTTG ACCCCATAAT ACTATGGAAT 4 98 0 

ACTAAGTATT GATAATCTTT ATAAGATTTA TATTTAGCTC GTGGCATACC GTTAATTGCA 504 0 

TCATCCACTT CTAAATCATT AAAATTAAAA TGTGCTTTAA ACCATTCATT TTCTTGTTCA 5100 


TTCGGTTCAT CAAAATCATA CCAAACAATA GTCGCATCTT TTGGTATCTC TTTGATATCA 516 0 

TCAACTACTT TAAACGGTTC ATATGTAGTT TGATACCGTA TCTTTAAAGC CATCGATACT 5220 

CCCCCTAAAT AACGAATTCT CTATTATTTT AT CATG AATT AAATAACGTG TATGTCTTAA 5280 


TTTATTTTAG TATGATAGTC ACTAAGGAGA TGGTTATTAT CAAACAACTT TTTACACATA 534 0 

CTCAAACCGT AACATCTGAA TTCATTGACC ATAACAATCA TATGCATGAT GCAAATTATA 5400 

ATATCATTTT TAGTGACGTC GTGAATCGTT TTAATTACAG CCACGGTCTT TCTTTAAAAG 5460

AACGCGAAAA TTT AG CAT AT ACGCTATTTA CACTAGAAGA ACATACGACA TACCTCTCAG 5520 

AATTGTCTCT TGGCGATGTA TTTACTGTTA CTTTATATAT TTATGATTAC GATTATAAGC 5580 

GGTTGCATTT ATTTTTAACA TTAACTAAAG AAGATGGTAC ACTAGCATCA ACAAATGAAG 5640

TAATGATGAT GGGAATTAAT CAGCACACAC GTCGTTCTGA TGCTTTTCCT GAATCATTTT 5700 

CAACACAAAT AGCACACTAT TATAAAAATC AATCAACTAT CACTTGGCCT GAACAATTAG 5760 


GACATAAAAT AGCAATTCCA CACAAAGGAG CATTAAAATG ACAGATGCAT TACAACAAAA 5820 

GATTCATATC GAATTACTAG ATTTATTAGA TGATGTTAAG TTTGAATTAA CAGAATTAAA 5880 

TGCACAAAAA GGGTTATACA TTAACGGACC AGCAAATCAG CTACTTAAGC GTGGCGTGCA 594 0 

TATGGCTTAT GTTCAAGGAC AAAAGCAAGC CATCGATAAT ATTATGACTA TTGTGGAACA 6000 

ACAGCTTGAA AGATCAACAT TTCCTAGAAC ATTATGATAA ATTTCAAAAT GAGGTTGCTC 6060 

EP 0 786 519 A2 



ATAATTTTTT AGATCAATTT TATCAAATTA AAGGGCAATA CTTTATCATC ACACATATCA 6180

ATACACTTAT TGGTGATTTT CACTCAGAAG CTCATTAACA ATTAGTCTAT ATAACCCTTG 6240

CTATATTTTC AAAAACAAAA CCCAATTACG TTTTCATGTC AAATATCATC TTGCATGAAA 6300

TCGTAACTGG GTCATTTATA TGTTATTAGT TATTTTGTGT TACATCCTCA TCTATCGATT 6360

TGGCAATTTG TTTAATAGCT TTATGTGATT GTCTAATTGG ATAAATTGGA AAATCATGTA 6420

CCATCTTAGG ATAATCATAA AACTCAATGT ATTGATGATG TTGCAACATC ATTTGTTCAA 6480

ATAGCTTCAT ATCAGGATGT GTCATTTCAC GTCCACCACC AAACATATAA ACTGGTGGCA 6540

ATCCTTCTAT TGTGCCATTA ATTGGCGATA TGCGCTTATC TGTTAATGGT AGGCCATTCG 6600

CCCATTTTTT CATAATCTCA TTGACACCAA ACTGACTTAG aACCGCATCT TGTTCGATTA 6660

AGGCGTCCGA AATATCTTTA TTAGATAGTG TTGCATCTAA AATTGGTGAG ATTAAATACA 6720

ATTTATTCGG TAATGGCTGT TGATTAkCTA AAAGAGATTG TACAAAGGAT AATGCCAGTG 6780

CACCACCTGA ACCATCACCC ATGACTACGA CATTTTGATG TCCTACTTCA GATACTAATT 6840

GaTCATAAAC ACGTTGTATC GCTTGGnAAA GTATCGTCaA TATGnAAACT CTGGTGTCTT 6900

TGGATAGATA GGCAGTACAA CCTCATATAA TGtACTTAAA GTGATTTTAT CCCAACAATC 6960

TCCAATGGAA CGGTGATGGT TGTAGTGCAT TGAATCCACC GTGAATATAT AAAATTTTCT 7020

TATCAATTTG ATGTCTGAAA TTAAAGCGAA AGACTTGCAT ATCATCTAAT GACAATTTTr 7080

CTAAATTTGC TTTAACATTT AATGTTGAAG GCTGCTTATG TTTTTTTCTA TTTTCAATTT 7140

CTCTTTTATA AAAAAATCTT TCAACATCTT GATCATTTTT AAACATAATC GAGCGATTGT 7200

GAAGCAAATA TTTATTGACA ACGCTATTCA TAACACGGTT TCTAATCAAT GTCTTAACCT 7260

ACCTTTATAT ATTTTATGTA TCCAATGATk GTCTATCCCC TACATTCTTT GCCAAAAAAA 7320

GTATATAATG TAGAAGATAT TTTCTTTTTC ACTTTCAAAT TTAAGACTAC AATTGAACAG 7380

TGATTTTTCA TCATTATAAC AGACAACTAG ACATATTGAT AAGTAAAGAA AAGAACTTTA 7440

TACGGAGGTA CCTTGCATGA CAAATCCAAA TCAACGATTA GAACCATTTG ATGAGACATT 7500

TCAACAACCG AATATTCATC GTGGTAAGCG ATATGGTAAG AAAAAACGTT CATTGGTAAG 7560

CATGATTATT CAAATCATTG TTGTwATATT AACCACCATC GCTGGAATAC AGCATGGTGG 7620

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9834 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

GTCATtACCG amTTTCtTAG AaTCATTTAA AGATGATAAA TATACAAACG TTGGTAATTT 60

AAAAGAAGTG AATTTTGATA AAATTGCTGC GACGAAACCC GAAGTAATCT TTATCTCTGG 120 

ACGTACAGCT AATCAAAAGA ATTTAGATGA ATTCAAAAAA GCTGCACCTA AAGCGAAAAT 180 

TGTTTATGTT GGTGCAGATG AAAAGAACTT AATTGGTTCA ATGAAACAAA ACACTGAAAA 24 0 

TATCGGAAAA ATTTACGATA AAGAAGATAA AGCTAAAGAA TTAAATAAAG ATTTAGATAA 3 00 

CAAAATTGCT TCAATGAAAG ATAAAACGAA AAACTTCAAT AAAACTGTTA TGTATTTACT 36 0 

AGTTAACGAA GGTGAATTAT CAACATTTGG ACCTAAAGGT CGTTTTGGTG GATTAGTTTA 420 

CGATACATTA GGATTCAATG CAGTTGATAA AAAAGTAAGT AATAG CAATC ATGGACAAAA 480 

TGTTTCTAAC GAATATGTTA ATAAAGAAAA TCCAGATGTT ATTTTAGCGA TGGATAGAGG 54 0 

TCAAGCGATA AGTGGTAAAT CAACTGCGAA ACAAGCATTA AATAATCCTG TATTAAAAAA 600

TGTTAAAGCA ATTAAAGAAG ACAAAGTATA TAATTTAGAT CCTAAATTAT GGTACTTTGC 6 60 

AGCTGGATCA ACTACAACTA CAATTAAACA AATTGAGGAA CTTGATAAAG TTGTAAAATA 72 0 

ATTTTAAAAG AGGGGAACAA TGGTTAAAGG TCTTAATCAT TGCTCCCCTC TTTTCTTTAA 780

AAAAGGAAAT CTGGGACGTC AATCAATGTC CTAGACTCTA AAATGTTCTG TTGTCAGTCG 84 0 

TTGGTTGAAT GAACATGTAC TTGTAACAAG TTCATTTCAA TACTAGTGGG CTCCAAACAT 900 

AGAGAAATTT GATTTTCAAT TTCTACTGAC AATGCAAGTT GGCGGGGCCC AAACATAGAG 96 0 

AATTTCAAAA AGGAATTCTA CAGAAGTGGT GCTTTATCAT GTCTGACCCA CTCC CTATAA 102 0 

TGTTTTGACT ATGTTGTTTA AATTTCAAAA TAAATATGAT AGTGATATTT ACAGCGATTG 108 0 

TTAAAC CGAG ATTGGCAATT TGGACAACGC TCTAC CATCA TATATTCATT GATTGTTAAT 114 0 

TCGTQTTTGC ATACACCGCA TAAGATTGCT TTTTCGTTAA ATGAAGGCTC AGACCAACGC 1200 

TTAATGGCGT GCTTTTCAAA CTCATTATGG CACTT AT AG C ATGGATAGTA TTTATTACAA 1260 

CATTTAAATT T AATAG CAAT AATATCTTCT TCGGTAAAAT AATGGCGACA scgTGTTTCA 1320 

GTATCGATTA ATGAACCATA AACTTTAGGC ATAGACAAAG CTCCTTAACT TACGATTCCT 1380 

TTGGATGTTC ACCAATAATG CGAACTTCAC GATTTAATTC AATGCCAAAT TTTTCTTTGA 1440

CGGT CTTTTG TACATAATGA ATAAGGTTTT CATAATCTGT AGCAGTTCCA TTGTCTACAT 1500 

TTAC CAT AAA ACCAGCGTGT TTGGTTGAAA CTTCAACGCC GCCAATACGG TGACCTTGCA 1560 

AATTAGAATC TTGTATCAAT TTACCTGCAA AATGACCAGG CGGTCTTTGG AATACACTAC 1620

CACATGAAGG ATACTCTAAA GGTTGTTTAG ATTCTCTACG TTCTGTTAAA TCATC CATTT 16 80 

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AGTGTTCTTT TTGAATAATG CTATTACGAT AATCTAACTC TAATTCTTTT GTTGTAAGTT 1800 

TAATTAACGA GCCTTGTTCG TTTACGCAAA GCGCATAGTC TATACAATCT TTAACTTCGC 1860 

CACCATAAGC GCCAGCATTC ATATACACTG CACCACCAAT TGAACCTGGA ATACCACATG 1920 

CAAATTCAAG GCCAGTAAGT GCGTAATCAC GAGCAACACG TGAGACATCA ATAATTGCAG 1980 

CGCCGCTACC GGCTATTATC G CATCATCAG ATACTTCGAT ATGATCTAGT GATAATAAAC 204 0 

TAATTACAAT ACCGCGAATA CCACCTTCAC GGATAATAAT ATTTGAGCCA TTTCCTAAAT 2100 

ATGTAACAGG AATCT CATTT TGaTAGGCAT ATTTAACAAC TGCTTGTACT TCTTCATTTT 2160 

TAGTAGGGGT AATGTAAAAG TCGGCATTAC CACCTGTTTT AGTATAAGTG TATCGTTTTA 2220 

AAGGTTCATC AACTTTAATT TTTTCATTTG GGATAAGTTG TTGTAAAGCT TGATAGATGT 2280 

CTTTATTTAT CACTTCTCAG TACATCCTTT CTCATGTCTT TAATATCATA TAGTATTATA 234 0 

CCAATTTTAA AATTCATTTG CGAAAATTGA AAAGAAAGTA TTAGAATTAG TATAATTATA 24 00 

AAATACGG C A TTATTGTCGT TATAAGTATT TTTTACATAG TTTTTCAAAG TATTGTTGCT 24 6 0 

TTTGCATCTC ATATTGTCTA ATTGTTAAGC TATGTTGCAA TATTTGGTGT TTTTTTGTAT 2520 

TGAATTGCAA AGCAATATCA TCATTAGTTG ATAAGAGGTA ATCAAGTGCA AGATAAGATT 2580

CAAATGTTTG GGTATTCATT TGAATGATAT GTAGACGCAC CTGTTGTTTT AGTTCATGAA 264 0 

AATTGTTAAA CTTCGCCATC ATAACTTTCT TAGTATATTT ATGATGCAAA CGATAAAACC 2700 

CTACATAATT TAAGCGTTTT TCATCTAAGG ATGTAATATC ATGCAAATTT TCTACACCTA 2760 

CTAAAATATC TAAAATTGGC TCTGTTGAAT ATTTAAAATG aTGctACCGC CAATATGTTT 2820 

TGTATATTTT ACTGGGCTGT CTAAGAGGTT GAATAATAAT GATTCAATTT CAGTGTATTG 2880 

TGATTGAAAA CAATTAGTTA AATCACTATT AATGAATGGT TGAACATTTG AATACATGAT 2940 

AAACTcCTTT GATATTGAAA ATTAATTTAA TCACGATAAA GTCTGGAATA CTATAACATA 3000 

ATTCATTTTC ATAATAAACA TGTTTTTGTA TAATG AATCT GTTAAGGAGT GCAATCATGA 3060 

AAAAAATTGT TATTATCGCT GTTTTAGCGA TTTTATTTGT AGTAATAAGT GCTTGTGGTA 3120 

ATAAAGAAAA AGAGGCACAA CATCAATTTA CTAAGCAATT TAAAGATGTT GAGCAAAAAC 3180 

AAAAAGAATT ACAACATGTC ATGGATAATA TACATTTGAA AGAAATTGAT CATCTAAGTA 3240

AAACTGATAC AACTGATAAA AATAGTAAAG AATTTAAGGC ACTACAAGAA GATGTTAAAA 33 00 

ACCATCTCAT ACCTAAATTT GAAGCATATT ATAAGTCAGC AAAAAATTTG CCTGATGATA 33 60 

CAATGAAAGT TAAGAAATTA AAAAAAGAAT ATATGACGCT TGCAAATGAG AAGAAGGATG 3420

CGATATATCA ATTAAAAAAA TTCATAGGTT TATGTAATCA ATCTATCAAG TATAACGAAG 34 80 

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AATTAGCTGA TAATAAAAGT GAAGCAACTA ATCTTACGAC AAAATTAGAA CATAATAATA 3600 

AAGCGTTAAG AGATACTGCG AAGAAGAACC TAGATGATAG TAAAGAAAAT GAAGTAAAAG 3660 

GCGCGATTAA AAATCACATT ATGCCAATGA TTGAAAAGCA AATTACCGAT ATTAACCAAA 3720 

CTAATATTAG TGATAAGCAT GTTAATAATG CAAGGAAAAA CGCAATAGAA ATGTATTACA 37 80 

GTCTGCAGAA CTATTATAAT ACACGTATTG AAACAATAAA GGTTAGTGAG AAGTTATCAm 3 84 0 

AAGTCGATGT AGATAAGTTG CCGAAAAAGG GTATAGATAT AACTCACGGC GATAAAGCCT 3 900 

TTGAAAAAAA GCTTGAAAAA TTAGAAGAAA AATAACTATA ATCATTTTTC AAAGTTAAAA 3 960 

ATTTTGAATT TATGGTTAAC ATGTCAACTT ACTATGTGTA TAATGGTAAA CATTGATATT 4 02 0 

AACTATATGT ATAAAAATGT CACGCAGATG CTATTTAAAT GTGATAAATA TTTTTAGAGG 4080

TGAATAGAGT GGCTATAAAG CTAAGTTCAA TTGACCAATT TGAACAGGTT ATTGAGGAAA 414 0 

ATAAATATGT TTTTGTATTA AAACATAGTG AAACTTGTCC AATATCGGCA AATGCGTACG 4200

ATCAATTTAA TAAATTTTTA TATGAACGCG ATATGGACGG TTATTATTTG ATTGTCCAAC 42 60 

AAGAACGCGA TTTGTCAGAT TATATTGCTA AAAAAACGAA CGTTAAACAT GAATCACCTC 4320 

AAGCATTTTA TTTTGTAAAT GGTGAAATGG TTTGGAATCG AGACCACGGT GATATCAATG 4380

TGTCGTCATT AGCACAAGCA GAAGAATAAT GAAACTATAG GGTTGGAACA TTTTGCCTTA 444 0 

CACTACTAGA CGTGAATAGC ACAACTTAAA TTCGTGTGAA TCAGAGTAGT TTGGCTATAA 450 0 

TGATGTTCTG ACCTTTTATT TTATGTCACC TTTAGAAGCA GTTAAGTTAG TACTTTTTTA 4 56 0 

CAAACATATG TATAATATAT TCGAGTATTT TTATTGAAAa tATTTTGGAA AACGACGAAT 4 62 0 

CCAATAAGAA AATTTAAACA TGATTTGTAA GTTAGTTTAA TAGGAAATAT ATGCTAAACC 4680 

AAAAGAAGCA TATTGTTATT TACTGGAATA ATTAATAATC ATGT CATGTT AAATGTTAGC 4 74 0 

ATATAATCAC GAGATAAAAT CTAAAATTTA AGATTAATCT TTTATGAATA AAAAACGTAT 4 800 

CACAACAAAT AATAAAGTAA GGTGGTCAAG GTTATGAAAG TATTAGTAG C CATGGATGAG 4 860 

TTTCATGGAA TTATTTCAAG TTATCAAGCT AATAGATATG TTGAAGAGGC AGTTGCAAGC 4920 

CAAATTGAAA CTGCAGATGT AGTTCAAGTA CCATTGTTTA ATGGAAGACA TGAATTATTA 4 98 0 

GATTCTGTAT TTTTATGGcm ATCTGGGcaA AAGTATCGTA TACCAGTACA TGATGCAGAT 5040

ATGAATGAAG TTGAAGGTGT TTACGGACAA ACTGATACAG GGATGACCGT TATCGAGGGG 5100 

AATTTATTTT TAAAAGGTAA AAAACCAATT GTTGAACGAA CAAGTTATGG TTTAGGAGAA 5160 

ATGATTAAAC ATGCATTAGA TAACGACGCA AAACATGTTG TAATTTCACT AGGTGGGATT 5220 

GATAGTTTTG ATGCTGGTGC AGGTATGTTA CAAGCATTAG GTGCTCAATT CTATGATGAC 5280 
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GATATGTCGA ACTTACACCC TAAAATGGAA ACAGCAAGAA TTCAAGTAAT GTCGGATTTT 5400 

TCAAGTCGAT TATATGGTAA GCAAAGTGAA ATCATGCAAA CTTATGATGC GCATCAGTTG 54 60 

AATCATAATC AAGCAGCAGA AATCGATAAT TTAATTTGGT ATTTTAGTGA GTTATTTAAA 5520 

AGTGAATTGA AAATTGCAAT TGGTCCAGTT GAACGTGGTG GTGCTGGTGG TGGAATTGCA 55 80 

GCAGTCTTGA ATGGACTGTA TCAAGCTGAA ATATTAACCA GTCATGCATT AGTAGACCAA 564 0 

CTAACACATT TAGAAAATTT AGTTGAACAA GCGGATTTAA TTATTTTTGG AGAAGGATTA 57 00 

AATGAAAATG ATCAGTTGCT AGAAACGACA ACATTGCGTA TTGCAGAACT TTGTCATAAA 5760 

CATCAAAAGG TTGCCATTGC AATTTGTGCA ACTGCTGAAA AGTTTGATTT ATTTGAATCA 5820 

CAAGGGGTTA CAGCAATGTT TAATACATTT ATCGATATGC CAGAAACTTA TACTGACTTT 5880 

AAAATGGGtT ACAAATTAGG CATTATACGG TTCAGTCTTT AAAACTGTTG AAAACACATT 594 0 

TTAATGTTGA GGTTTAGTAA AGAAGGACTA AATTGGTGAT GCTGTCATGA TGGTTAATAA 6000 

CATTTATGAT GGTTAGCAAA ACGAATTAGA AGATCGAAAG TATACGTAAA AAATATGAAA 6060 

AAT CACGCTA TCATTGCACT GAATGTTAGC GTGATTTTTA TATATTAATT AAGCCTGAGT 6120 

TGAACTAGTA TATAATCGTT GGTTTTTAGT GATTTTCAGC GATATCTTCT ACAATTCCAA 6180

TGATTACTTG TACTG CTTTT TCCaTAACAT CAATGGATGC aTATTCATAT GGGCCGTGGA 6240 

AGTTACCGCA ACCTGTAAAG ATGTTTGGAG TTGGTAACCC CATAAATGAC AATTGTGAAC 63 00 

CATCTGTACC ACCGCGAATA GGTTCAGTGT TTGCTGGAAT ATCTAATTTG GCAAAGACAC 6360 

GTTTAGGTAT ATCAATAATA TGAGGCAATG GTAATATTTT TTCTGCCATA TTGAAATATT 64 20 

GATCCGATAT ATCAACTTTA ACTGGATAAT TTTCAAAATG GGCATTGATA TCGTCACGTA 64 80 

TTTCTAAAAT~XCX3TTTCTTAHC~G^ ATCATGATCA CGAATAATGT 6540 

ATTGCAAAGT TGCTTTTTCA ACAGTTCCTT CAAAGTTCAT TAAGTGATAA AAGCCTTCGT 6600 

ATCCTTCTGT TCGCTCCGGA ACTTCACTAT CAGGTAGCAA ACTATCGAAT TGTTCACCTA 6660 

AACGTATTGC GTTTACCATT GCATTTTTAG CTGAACCAGG ATGAACATTT ACACCGTGGC 6720 

ATGTAATAAC CGCTTCAGCA GCGTTAAAGC TTTCATATTG TAATTCTCCA TATTGACTAC 6780 

CATCCATAGT ATAAGCAAAA TCAGCATTGA AGCGGTCAAC ATCAAATTTA TGTGGACCAC 6840

GACCGATTTC TTCGTCTGGT GTAAATCCAA TGCGAATGGT ACCATGTTTA ATTTCTGGAT 6900 

GTTCTTGTAA ATAACAAATA GCTTCCATAA TTTCCACAAT ACCCGCTTTA TCGTCTGCAC 6 960 

CTAGTAACGA TGTACCATCA GTTACCATTA ATGTATGACC AACTAAACTG TTAAGTTCTG 7020

GAAATACTTT AGGATCTAAG ACACGTTTAG TATTGCCTAG TTTGTATGGC TTACCATCAT 7080 
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GCGCCAAAAA TCCAACTGTT 
GGGACGTCGA CATCGATGTT ACTTTCTAAT GTAGCAAATA 7200

AGTAGCCATT TTCATCTAAA TCAGTTGGCA ATCCTAATTG TTGTAATTCT TTTTCTAATA 7260

AATGTAACAA ATCCCATTGC TTTTCAGTTG AAGGTGTTGT TGTAGATTTT GGATCAGATT 7320

GCGTATCAAT TGTCGTATAT CTTGTTAATC TATCTATCAA TTGGTTCTTC ATTATATTCG 7380

ACCCCTTAAA CTCTATTATT CATGTTGTAA GATTTTTTAT ATGTCTTACC TTTGATTTTA 7440

CCATACAGTT GTTTGATACG TGTGTATAGG TAATATAGAA TTTCAGAAAC TAATATACCG 7500

AAAGCAATCG CACCTGAAAT CAGTGTAcTT CTAAAAATGT ATTTACAGCA CTTGTATAAT 7560

CATTTGATAC TAAAAAACGA GTCGCTTGAT AAGCTGCACC ACCAGGTACT AATGGTATAA 7620

TGCCTGGCAC TATGAATATA ATTACCGGTC GTTTATATCT GCGACTCATA GTATGACTCA 7680

TTAAGCCTAA AATTAAGCTT CCCAAAAATG AAGCGCCAAC TTTTCCAAAC TCTAAATCTA 7740

CCGTTAATTG GTAAATCGTC CATGCAATGG CACCCACAAA TCCACATGCT ACTAAGAGGC 7800

GTTTGGGTGC ATTGAAAATG ATAGAGAAAA GTACTGTTGA TATAAAGCTG ATTGTAAAAT 7860

GAAATAAATA AAATAGCATG CTTTAACAGT CCTTCCTTAA ATGATTAATA AAACGATTGC 7920

GACACCAGCA CCGATTGCGA ATGCTGTTAA TGCAGCTTCA ACACCGCGAG ACATACCTGC 7980

AAGTAATTCA CCCGCTAATA AATCTCGAAT GGCATTGGTA ATTAATATAC CAGGGACAAG 8040

TGGCATGACA CTGGCTATAG TAATGATATC TTGATTGGTT GCAATGCCTA ATTTAGTAAA 8100

TGTGGCTGCA ATGGATATGA CCACAGCGGC TGCAACAAAC TCTGAGAAAA ATTTAATTTG 8160

TATATAGCGT tGCACAAAGC TGAATGTTAA AAATGCGGAT CCGCCAGCAA TGACTGCAAT 8220

CCAACAATCT GATGCGACAC CACCAAACAT AAATAGGAAG AAGCCACATG CAATGGCAGC 8280

TGCAAAGAAA TTCGTTAAAA AAGAATATTG TAATGATGCA TGCTGTAAAT GAATAAATTC 8340

AGATTTAGCT TCATCAATTG TGAGTTCTTT ATTTGATATT TTACGTGAAA GACTATTCGT 8400

TAAAGCGATT TTCTCTAAAT CTGTTGTACG CTCTTGTACA CGAATTAATC TTGTACTTGT 8460

TCGATCGTTT AATGAAAAAA TAATTGCAGT TGAACTGACA AAACTATATG TATTATGAAG 8520

ACCATAACTA TGTGCGATAC GGTTCATTGT ATCTTCAACT CGATATGTTT CAGCACCTGA 8580

TTCaAGTAAA ATTCTACCTG CAATTAATAC AACATCAATC ACTTTGTTTT CATCTATAAT 8640

TGTGATTGAA TCTGGCATAT CAATTCACCT CCAATGATAT GTGTTATTTA TTTGAACAAT 8700

TGaAGTTTAC AACTTGTTGT TACAACTTTC AATAGTGAGA CTTTGTGTTA GTATGATGAA 8760

CTTGTATGGT TCAAATTTAA ATAAGAAAAA CTGTTAATCT TTGCTATTAT ACTATGATTT 8820

AATAATAGCA AAGGATTAAC AGTTTTGTCG TTGTTATAAA TTGATAATAG GGTTAAACAT 8880
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TTTACGCTGT 


GATTTTGGAT CGTCATCTGT TAAATAACCA ACACCGATAG ACACTGACAA 9000

TTTAATAACT TCTTTGTTTG GTAAATGGAA TGATGATTTT TCAACACCCG AACGAATATT 9060

TTCAGCTAAT TTAACACTTT GATCAAGTGA ATAATTGTGA ATGACAACTG AGAACTCTTC 9120

GCCACCATTT CTAAAAATTT I AAAIiGAn CGGCACATAG TTl-lTAAGTA ATTGAGACAT 9180

TTGTTTTAAT ACAGCATCAC i oA J. 1 1\jTG TGAGTAGGTA TCATTGaCAT CTTTAAATCC 9240

ATCGATATCG ATTAATAATA A 1 LxL-CxATACT TTGATGTTCT TTTTCAGCTT TTCGTGAAAT 9300

TTCATTTAAA TGTCTATCAA Ail LTTTTAC ATTACCTAAG CCTGTTAAGT AATCATATTT 9360

ATCTTCGTTT TCATAACGAT 1 1 AL-CjAGTCjA GAAGAAATGC CAAATATCGA CAAATGTTAT 9420

CGCTGAAGCT AAAGTGATAA TTAATGAAAT TGGTATTAAA ATGATAACTT CCGATAGTGT 9480

GTAAATAGGA CTCACTAACG CGACACCAAA TAAAATGATT ATTGTAACAA CATTAAGTAT 9540

TAATAATGAT AGCACATCAT TTTGTTTTAA AAATGGTCCA ATAGCACTTG TTACTGCAGC 9600

AATAACAATC AACGTAACAC CGTACATAAT CGAGTTGTTA AATACTACAA TTTCAACAAT 9660

TGCTACAATT ACTGTGGCAG ATAATGTATA GACCATATTT GTAAATCTAC CTAAAAACAA 9720

TAAAGGAACG AATGTTAAGT GAATTAAATA ATCTTCACGA TAAGGGATAG GGTAGACAGA 9780

TAATAATAAT GATACGATTG TCATTAAAAC AGTGACATAA GCCTTAGAAA AAAC 9834




(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23439 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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-<xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 



TCTCAATCAG ATGAAAAATT GCATATCGTA GGTTTTACAG AAAGTGCAAA ATATAATGCG 60

TCATCAGTCA TTTTCACGAA TGACGCTACC ATTGCCAAGA TCAATCCTAG ATTGACTGGA 120

GATAAAATTA ATGCAGTTGT TGTACGTGAT ACAAATTGGA AAGACAAAAA ATTAAACCAA 180

GAGCTTGAAG CGGTAAGTAT TAATGACTTT ATTGAAAATT TACCAGGTTA TAAACCACAG 240

AACTTAACAT TAAACTTTAT GATTTCATTC TTATTTGTCA TTTCAGCTAC AGTTATAGGC 300

ATTTTCCTAT ATGTCATGAC ATTACAAAAG ACGAGTTTAT TTGGCATATT AAAAGCTCAA 360

GGATTTACGA ATGGCTATTT GGCGAATGTG GTAATTTCGC AGACGGTCAT ATTAGCACTA 420

TTTGGTACGG CATTTGGCTT ACTGTTAACA GGCGTTACAG GTGCATTTTT ACCTGATGCA 480
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TCTGTATTAG GAAGTTTATT CTCCATTTTA ACAATTAGAA AAATAGATCC GTTAAAGGCG 600 

ATTGGGTAGG AGGTGTAGCA AATGTTGAAA TTTGAAAATG TAACAAAGTC ATTTAAAGAT 660 

GGGAATCGTA ACATTGAAGC GGTTAAAGAT ACAAATTTTG AGATAAATAA AGGTGATATT 720 

ATAGCATTGG TTGGACCTTC TGGCTCTGGT AAAAGTACAT TTCTAACTAT GGCAGGTGCT 780 

TTACAAACAC CGACATCTGG GCACATTTTA ATCAATAACC AAGATATTAC GACAATGAAG 84 0 

CAAAAAGCAT TGGCAAAAGT TAGAATGTCT GAAATAGGTT TTATTTTACA AG CTACAAAC 90 0 

CTTGTACCAT TTTTAACGGT AAAGCAACAA TTTACATTAT TGAAAAAGAA AAATAAGAAT 960 

GTTATGTCTA ATGAAGACTA TCAGCAACTT ATGTCACAAT TAGGTCTAAC TTCATTGCTT 1020 

AATAAGTTAC CTTCAGAAAT TTCAGGTGGT CAGAAACAAC GTGTGGCGAT AgCaAAGCGT 1080 

TATATACGAA TCCGTCGATT ATTTTAGCGG ATGAACCTAC CGCGGCGTTA GATACTGAAA 114 0 

ATGCGATTGA AGTCATTAAA ATTCTACGTG ATCAAGCCAA ACAAAGAAAG AAAGCATGTA 1200 

TTATTGTTAC ACATGATGAA CGACTTAAAG CATATTGTGA TCGTTCATAT CATATGAAAG 1260 

ATGGCGTCCT TAATCTTGAA AATGAAACAG TAGAATAGTT TTATTAAGCC GGTACATCAT 1320 

GTGCCGGTAT TTTTATGTTT ATGTATTATT TGAATAAACT TTCACATTCA ATTAATAATA 13 80 

ATTATTATCG AAAATCAGAA ATATTCCGTG AAATATAATA TTTTTTGTAG TAAAATGGCC 1440 

TCTAAGTATT CAATATTTAA ATATGGGGAT TGAATATAAA ATTATCGTAA TGGGGGTCAA 1500 

TGGTTATGGA TTTATTGATA GGTACTTTAT TTTTATTTTT GGTCTTAGTG ATTTTTACAT 1560 

TATTTACATA TAAAGCGCCT AATGGTATGC GTGCCATGGG AGCATTAGCT AATGCAG CAA 16 20 

TCGCAACATT TTTAGTGGAA GCATTTAATA AATATGTTGG TGGCGAAGTA TTCGGTATTA 1680 

AATTTTTAGA AGAGCTAGGA GACGCTGCGG GAGGTCTAGG TGGTGTCGCT GCCGCTGGAT 1740 

TAACAGCATT AGCTATCGGT GTGTCACCAG TATATGCATT AGTTATAGCA GCCGCGTGCG 1800 

GTGGTATGGA TTTATTACCA GGTTTCTTTG CGGGTTATAT GATTGGATAT GTGATGAAAT 1860

ATACAGAGAA ATATGTGCCG GATGGTGTCG ACTTAATTGG ATCGATTGTC ATCTTAGCGC 1920 

CATTAGCTCG TCTTATTGCA GTATTATTAA CGCCAGTAGT GAATAGTACA TTGATTCGAA 1980 

TTGGTGATAT TATCCAAAGT AGTACGAATA CGAATCCAAT TATCATGGGT ATCATTTTAG 2040 

GTGGTATTAT TACGGTTGTC GGCACAGCGC CATTGAGTTC AATGGCATTG ACAGCATTAT 2100 

TAGGTTTAAC GGGTGTACCT ATGGCTATTG GTGCCATGGC AG CATTTAGT TCGGCATTTA 2160 

TGAATGGGAC GCTATTCCAT CGCTTAAAAT TAGGTGATCG TAAGTCTACG ATTGCAGTAA 2220 

GTATTGAACC TTTATCACAA G CAGAT ATTG TATCAGCCAA TCCAATTCCA ATCTATATTA 2280 
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ATGCGACAGG TACAGCTACA CCGATTGCAG GATTTTTAGT TATGTTTGGA TTTAATCATC 2400 

CGACGACAAT TGTGATTTAT GGTGTAGTAA TGGCGATTGT AGGTGCGCTT GCAGGTTATC 246 0 

TTGGTTCAAT TGTATTTAAA AAATATCCAA TTGTTACTAA GCAAGACATG ATTAATCGAG 2520 

GTGCAGTAGA CGCATAGCAT CATCATATTG AATAGTAAAA ACAAATAAAA CATAGTAACG 258 0 

TGATTCAGTC GATGTAACAG TCGATAATGA GTCACGTTTT TTTATAGAAA AATACAAGAC 264 0 

ATAAAAATGT CATAATTTAT TGTCGACAAA TATCATACTG TATAAACATT TATCATTTTC 27 00 

TCAAGTACCT TTTACACGAT GGAATGAACT TACTTTTTAC GAAATTATGC GTATTTTATA 276 0 

AACAAATATC ATTGATATAA CGGTAAATGT AAGCGTTTAC AACAGAAATA ACAGCATGCT 2820 

ACGATATTTT TGTAAATTCA CTGATTCAAG TATTTTAAGT CAATATGAGG AGGGATGTTA 2880 

TGAGCGATTC TGAGAAAGAA ATTTTAAAAA GAATTAAAGA TAATCCGTTT ATTTCACAAC 2940 

GTGAACTTGC TGAGGCAATT GGATTATCTA GACCCAGCGT AGCAAACATT ATTTCAGGAT 30 00 

TAATACAAAA GGAATATGTT ATGGGAAAGG CATATGTTTT AAATGAAGAT TATCCTATTG 306 0 

TTTGTATTGG CGCAGCGAAT GTAGATCGTA AGTTTTATGT GCATAAAAAT TTAGTTGCAG 3120 

AAACATCAAA TCCTGTAACG TCAACACGCT CTATTGGTGG CGTAgCAAGA AATATTGCTG 3180

AGAACTTAGG TAGGCTTGGC GAAACGGTCG CTTTTTTATC TGCTAGTGGA CAAGATAGTG 324 0 

AATGGGAAAT GATTAAACGA TTGTCCACAC CATTTATGAA TTTGGATCAT GTTCAACAAT 3300 

TTGAAAATGC GAGTACAGGT TCATATACAG CTTTAATTAG TAAAGAAGGC GACATGACAT 3360

ATGGCTTaGC AGATATGGAA GTGTTTGACT ACATTACGCC TGAATTTTTA ATTAAGCGTT 34 20 

CACACTTATT GAAAAAGGCT AAGTGCATTA TTGTAGATTT GAATTTAGGC AAAGAGG CAT 34 80 

— TAMlCVi^^ —3540 




CTTCCCCAAA AATGAAAAAT ATGCCTGATT CATTACATGC TATTGATTGG ATTATCACGA 3600 

ATAAAGATGA AACAGAAACA TACTTAAATT TAAAAATAGA AT CTACTG AT GATTTAAAAA 3660 

TAGCTGCTAA ACGCTGGAAT GATTTAGGTG TTAAAAATGT TATTGTGACA AATGGCGTGA 3720 

AAGAACTCAT TTATCGAAGT GGTGAGGAAG AAATCATTAA GTCAGTTATG CCAT CAAATA 3780 

GTGTGAAAGA TGTTACAGGT GCAGGCGATT CATTCTGTGC TGCAGTAGTG TATAGCTGGT 3840 

TAAATGGGAT GTCTACTGAA GATATATTAA TTGCTGGTAT GG TTAACGCA AAGAAAACGA 3900 

TAGAAACGAA ATATACAGTT AGGCAAAACC TAGATCAACA GCAACTTTAT CACGATATGG 3960 

AGGATTATAA AAATGGCAAA TTTACAAAAG TATATTGAGT ATTCTCGAGA AGTTCAGCAA 4020

GCACGGGAGA ACAATCAACC GATTGTAGCA TTAGAATCAA CAATTATTTC GCATGGTATG 4080 
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GCCATTCCAG CAACCATAGC CATTATAGAT GG CAAAATTA AAATTGGTTT AGAAAGCGAA 4 2 00 

GATTTAGAAA TACTGGCAAC TAGTAAAGAC GTTGCTAAAG TATCTAGAAG GGATTTAGCA 4260 

GAAGTTATTG CGATGAAGTG TGTTGGTGCT ACTACTGTAG CGACGACGAT GATATGTGCT 4320 

GCAATGGCTG GTATTCAATT TTTTGTTACA GGAGGTATTG GGGGCGTCCA TAAAGGTGCA 4380 

GAACATACGA TGGACATTTC AG CAG ACTT A GAAGAACTGT CTAAAACAAA TGTCACTGTT 444 0 

ATCTGTGCAG GTGCCAAATC AATTTTAGAC TTACCTAAGA CGATGGAGTA TTTAGAAACA 4 500 

AAAGG CGTTC CAGTTATTGG ATATCAAA CG AATGAATTGC CAGCATTCTT CACTCG CGAA 4 560 

AGCGGTGTTA AGTTAACAAG TTCGGTTGAA ACGCCAGAAC GACTTGCTGA CATTCATTTA 4 62 0 

ACAAAACAGC AGTTAAATCT TGAAGGTGGC ATTGTTGTTG CTAATCCAAT TCCATATGAG 4 6 80 

CATGCCTTAT CAAAAGCATA TATTGAGGCA ATCATAAATG AAG CTGTTGT TGAAGCGGAA 4 74 0 

AATCAAGGTA TTAAAGGTAA GGACGCCACA CCGTTCTTGT TAGGGAAAAT TGTAGAAAAA 4800

ACGAATGGTA AAAGTTTAGC AGCAAATATA AAACTTGTTG AAAACAATGC GGCGTTGGGT 4 860 

GCTAAAATTG CTGTCGCTGT TAATAAATTA TTGTAGGTGA TGATACATGA ATATTTTATT 4 920 

CGCTATCACA GGGATAGCAT TTGCACTATT TGTTGCGTTT TTATTCAGTT TTGATCGTAA 4980

AAAAATAGAC TTCAAAAAGA CGTTAATAAT GATATTTATT CAAGTGTTGA TCGTGTTATT 504 0 

TATGATGAAC ACAACGATTG GTTTGACAAT TTTAACTGCA CTAGGTTCAT TTTTTGAAGG 510 0 

GCTAATAAAT ATTAGTAAAG CAGG CAT AAA TTTTGTTTTT GGAGATATAC AAAATAAAAA 516 0 

TGG CTTTACG TTCTTTTTAA ACGTATTACT GCCATTAGTT TTTATTTCTG TATTAATAGG 522 0 

CATCTTTAAT TATATTAAGG TATTACCATT T ATT AT CAAA TATGTAGGTA TCGCTATTAA 52 8 0 

TAAAATAACT AGAATGGGGC GCTTAGAAAG TTATTTTGCT ATTTGAACAG CAATGTTTGG 534 0 

GCAACCAGAA GTATATTTAA CAATAAAAGA T ATT ATT CCA AGATTATCTA GAGCGAAATT 54 00 

ATATACAATT GCGACGTCTG GTATGAGTGC TGTTAGTATG GCAATGCTAG GTTCATATAT 5460 

G CAG ATGATT GAACCCAAGT TCGTAGTTAC AGCAGTAATG TTAAATATTT TTAGTGCGCT 5520 

TATCATCGCC AGTGTAATCA ATCCCTATAA ATCTGATGAT ACTGATGTTG AAATTGATAA 5580 

CTTAACGAAA TCCACAGAAA CTAAAACATT GAATGGAAAA ACAGGAAAAC CTAAGAAAGT 5640

TGCCTTTTTC CAAATGATTG GTGATAGTGC GATGGATGGG TTTAAAATCG CTGTTGTAGT 5700 

AGCCGTAATG TTGTTAGCAT TTATTTCATT AATGGAAGCA ATTAATATCA TGTTTGGTAG 576 0 

TGTTGGTTTG AACTTTAAAC AGCTTATTGG CTATGTGTTT GCACCAATCG CATTCTTAAT 5820

GGGGATTCCA TGGAGCGAAC TGTTCCAGCT GG CTCTTTAA TGGCGACTAA ATTAATTACA 5880 
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CAAGGTATCA TTTCAGTTTA CTTAGTAAGc TTCGCTAATT TTGGTACGGT TGGTATCATC 6000 

GTAGGTTCAA TTAAAGGCAT TAGTGATAAA CAAGGAGAAA AAGTTGCATC CTTTGCAATG 6060 

AGGTTGCTAC TTGGTTCAAC TCTAGCTTCA ATCATTTCAG GATCAATCAT TGG CTTAGTA 6120 

TTGTAAATGA ATCGAAGTAC CTAAATTAAA TTCATGGCAA AGCTAAACCC CGTCACCAAG 618 0 

TTGGCGCAAC AGCGcATgcA TAACTTAGTG ACGGGGTTTT ATCATAACAA TCTACTTTTT 6240

CGTAGCCGTT TTTGAAATGT ATGTTGATGG TTTATCTTTT TCAAAAATTG TTAATCCCGT 6300

TATATCTTTT TTATGTTTTG AAGGGACAAT GAAGCTAAGT ATATAAGCAA AGACAAAAGC 6360

AACTGTAAAT GAAATGGTAG ATACATAGAA AGGTGAGTTA CCTTTGCCAA CACCATTATA 6420

GACATAAGCA AAGATGATAC CCAATATTAA TCCACAAATA ACACCGAATG TATTCGTACG 6480

TTTAGTGAAA ATACCAACTG CAAATACACC AGCCAATGGA ACGCCGAATA ATCCAGTCAC 6540

AAACAAGAAT AAATCCCATA AGTCATTTGA ATTAGAAGCA ATTAAGTATA GTGACATTCC 6600

AAAACCGAAA ATACCTGCAA TGATAATAAT GAAACGTGCA AAGTTAACTT CGTGTCGCTC 6660

GCTACCTTTT CCGAAGAAGC GTTGCTTAAT GTCGATTGAA ATACAAGCAG ATATAGAATT 6720

TAAACTAGAT GAAATGGTAG ACTGTGCAGC GGCGAAAATG GCTGCAATAA GTAATCCTGC 6780 

TACAAATGGT GGCATCTCAG TCAAAATGAA ATATGGCACT ACAGATGATG TATTGAAGCC 6 84 0 

TTTTGGTAAA ACAGCTTCAT GTGTATAAAA TGAATACAGC ATTGTACCCA TACCATAAAA 6900 

TAAGGGTGCT GAAATTAAAG CTAGGATACC ATTTGT CCAT AACGATTTAT TTGTTTCTTT 6 960 

TAAACTATCA GAAGCTTGAT AACGCTGCAC GACGTCTTGA CTCGCTGTGT ATTGATACAA 7020 

GTTGTTGAAA ATATTTCCTA GGAAAATAAT TGGAATGGCA GCTGCCGCAG TATTTAGTTT 7080 

CCAATTGTCT GCACTAATTA ATTTTTTGTG CTCAATCGCA TCTGCAAAGA CAGTGCCGAA 7140

ACCGtCTTTA ATGTTCACAA CACCTAGAAT AATAATAACT AAAGCGCCGC CTAATAAAAT 7200 

GACGCCTTGA ATGAAATCAC TCCAAACCAC ACCTTCGAAA CCACCTAAAA ATGTATATAA 7260

AATACATAGT AAACCAACGA GTGATGCAAC GATATAAGGG TTCATGTCTG ATACAGATGT 7320 

GATTGCTAAT GTTGGTAAGT AGATAACAAT TGCAACACGC CCTAAATGGT AAACGACAAA 7380 

TAATAATGAG CCAATGACAC GTATGCTAGG GCCAAATCTA GCTTCTAAAT ATTCATATGC 7440

AGATGTTACC TTTAACTTTT TAAAGAAAGG GACATAGAAA TAAATAAGTA ATGGAATAAT 7500 

TGCGACGATA GCAATGTTAC CAGCGATATA TGACCAATCT GTTAAAAATG CTTTCTCTGG 7560 

TGTCGACATA AATGTAATCG CACTTAACGT AGTAGCATAA ATTGAAAAGC CAACTACCCA 7620 

AGATGGCAAG CGACCACTTG CGGTAAAGAA ACTATTGGTA CTTTGGCTCG CGCGCTTGGT 7680 
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TGTGCCAAAT CCAACTTCTT TCATGGGCAA CATCCCCTTT ACAATGTATT GATTCTTTGA 7800 

TGTCTATAAA TCGTATTTTG CAATGAGTTG ATCTAATGTT TGTCGATGTG CTTCGTTAAA 7860 

AGGTTTGAAA GGTCTTTTCG GTAATCCTGC ATCAATGCCA CGATGACGTA ATATTTCTTT 7920 

CAATGTTGGA TAAATCCCCA TTGATAACAC TGTTTCGATA ATGTCGTTTG AATCATGTTG 7980 

CAGTTGGTAA GCTTCTTGAA TTTGACCTTG TCGTGCTAAG TCGAAGATTT TTCTTGCACG 804 0 

GCGACCATTA ACGTTATATG TAGAACCAAT TGCACCATCT ACGCCAGAAA TCGTAGCTTG 8100 

AACTAACATT TCATCAAAGC CAGATAAGAT TAATTTGTCT GGGAATGCTT TTCTAATACG 8160 

TTCGAGTAGG AAGAAGTTTG GCGCTGTATA TTTAACACCA ACAATTTTTT CATGATTAAA 82 20 

TAGCTCG CTG AATTGTTCAA TAGAAATATT GACACCTGTT AAATCTGGTA TTG CATAAAT 8280 

AATCATATTG TTCTGAGTTG CTTCGATAAT ATCGAAATAG TAATCTCTAA TTTCTTCAAA 834 0 

AGTAAATGGA TAGTAGAATG GTGTTACGGC AGAAAGTGCA TCATAACCGA GTTCTGTGGC 84 00 

ATATTTTCCA AGTTCAATGG CTTCATTTAA ATCTAACGAA CCTACTTGAG CAATCAATTT 84 60 

CACTTTATCC CCAACTGCCT CTTTGGCAAC CTTGAAAACT TGCTTCTTCT GCTCTGTATT 8520 

TAATAAAAAG TTTTCGCCTG AGCTACCATT T A CAT AAAG A CCGTCTAATT CTTCAGTTTC 85 80 

AATGGCATTT TGAGCAATTT GTTTAAGTCC TTGTTCATTT ACTTGACCAT TTTCATCAAA 8640 

AGGAACGAGT AACG CTG CAT ATAAACCTTT TAAATCTTTG TTCATTATGA AGTCCCTCCA 8700 

AAAATCATTT GATAATATAG TTTACAGCTA TAATTGTAAA CGCTATCATA AAATGTAACA 8760 

ATATCTTTTT GAAAATTGTA GTCATATTTA TGTATAATTA ATGAAAATGT TTTTCAAAAT 8 820 

CAATAGAAAT GGAGTGAGTA AGGTGTATTA CATCGCAATC GATATTGGAG GCACTCAAAT 8880 

TAAATCGGCA GTTATTGATA AGCAATTGAA TATGTTTGAC TATCAACAAA TATCAACGCC 894 0 

GGACAACAAA AGTGAGCTTA TTACTGACAA AGTATATGAG ATTGTAACAG GATATATGAA 9000 

GCAATATCAG TTGATC CAAC CTGT CATAGG TATTTCATCA GCAGGCGTTG TTGATGAACA 9060 

AAAAGGCGAA ATTGTATACG CAGGGCCAAC CATTCCGAAT TATAAAGGTA CTAATTTTAA 9120 

GCGATTATTA AAATCACTGT CTC CTTATGT CAAAGTAAAA AATGATGTAA ACGCTGCATT 9180 

ACTAGGCGAA TTGAAATTAC ATCAATATCA AGCAGAACGG ATCTTTTGTA TGACGCTTGG 924 0 

TACAGGCATT GGGGGTGCGT ACAAGAATAA TCAAGGTCAT ATTGATAATG GTGAGCTTCA 9300 

TAAGGCAAAT GAAGTTGGGT ATTTATTGTA TCGTCCAACT GAAAATACAA CGTTTGAGCA 9360 

ACGTGCTGCA ACGAGTGCAT TGAAAAAGCG CATGATTGCC GGAGGATTTA CGAGAAGCAC 942 0 

ACATGTGCCA GTATTGTTTG AAGCAGCTGA AGAAGGTGAT GATATTGCAA AACAAATATT 9480 
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AGGGCTTATA TTAATTGGGG GCGGTATATC TGAACAAGGA GATAATCTCA TTAAATATAT 
CGAGCCGAAA GTTGCACACT ATTTACCAAA AGACTATGTT TATGCACCAA TACAAACGAC 9660

TAAGAGTAAA AATGATGCAG CATTATATGG CTGTTTGCAA TGATAGTTGA AAGAAGGAGT 9720

CATTCTAAAA TAGAATTTGA AACCGTTACG AGAGATGAGA GCTGTTGTTA GTTCCACACA 9780

TCACACTCTA TCTAGGACCA ATCTAAACTA TATCAACCAA CAGTGTGCCA CGGGCAAATT 9840

AAATTGAAGA AGCTGAGATA TTAAAATTTT AGAAAATGTA AAAAAATATT TGGTATTGAA 9900

ATTAAAAAAG CACCTAGCAA CTCGTTGGGA CAATCACGAT GATTGTCTAC AGTTGCAGGT 9960

GGATTTGAAT ATACTACTAG TTATTTGTTG TCTAGGATAA TAGATTTAGT ATGTTGATAA 10020

GTTTGACTCA GATTCGTATT TTCTAATAAA TGATAACTCA CGATATCGAT TAAAAAGAGT 10080

GTCGCAATTT GTGTGTTGAT AAATTGATGG TCGGTATTAC GCGATTGATC CGTTGTTAAA 10140

AGTACTAAAT CTGCACAATC TGTAAGTTTA CTACCTTCAA AATTTGTGAT GGCAACGACA 10200

TATGCACCAT GAGATTTGGC GACTTCCGCT GCAGAAATTA ATTCCGAAGT ATTACCACTA 10260

TTTGACATAG CAATAAACAT ATCCGAATGA GATAGTAGGG ATGCCGATAT TTTCATTAAA 10320

TGTGAATCGG TAGTAACATT ACCTTTTAGC CCCATACGAA TCATACGATA ATAAAATTCA 10380

GTCGCTGATA AACCAGAGCT ACCTAGTCCA GCAAAGAGTA TATGTCGACT TGATTGAAGT 10440

TTGTCGATAA AGGTTTGGAT AATGTCGTTA TCAATAAATT CACCAGTTTG TTGAATGATT 10500

TGTTGATGAT ATTTATGAAT TCTTTGAATA ATTGGGCTAT TTTCAATAAC TGTCTCTGTC 10560

ATTTCTTGTT GAATATTAAA TTTTAAATCT TGGAAATTCT CATAATCCAG CTTATGACTA 10620

AAGCGTGTCA TCGTTGCTGG TGATGTACCA ATCGCATGGG CTAAGGAGTT AATCGTTGAA 10680

AAGGCATCGC TATAACCATT TTGTCTTATA TAATTGACGA TGCGTTTATC AGTTTTTGTA 10740

AATAAATGTT GATAACGTTG AACACGATTC TCAAATTTCA TTGTGTCACC CCTTCATCTT 10800

AATGATTACT ATTATATATG AAAAATATTT TCAAGATAGT AAAAAGCATT GATAAAAATT 10860

ATCTTAATGA TATATTGTAA ATGACTTTAC GTGAAAAAAC GACTTATGGA GTGAGGAATA 10920

ATGTTACCAC ATGGATTAAT AGTATCTTGT CAGGCACTAC CAGATGAACC ATTGCATTCA 10980

TCT1TTATTA TGTCGAAAAT GGCATTAGCT GCGTATGAAG GTGGTGCTGT TGGTATTCGC 11040

GCAAATACTA AGGAAGACAT TTTAGCAATT AAAGAAACGG TAGATTTACC AGTTATTGGC 11100

ATTGTGAAAC GTGACTATGA TCACTCAGAT GTTTTCATTA CTGCAACGTC AAAAGAAGTT 11160

GATGAACTGA TAGAAAGCCA ATGTGAAGTC ATTGCATTGG ATGCAACGTT ACAGCAACGT 11220

CCGAAAGAAA CGTTAGACGA ATTAGTATCA TATATTAGAA CACATGCACC GAACGTTGAA 11280
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TATATTGGCA 


CGACGTTACA 


TGGCTATACT 


AGTTATACGC 


AAGGACAATT 


ACTTTATCAA 


11400 




AATGACTTCC 


AATTTTTAAA 


AGATGTACTA 


CAAAGTGTTG 


ATGCAAAAGT 


TATTGCGGAA 


11460 


5 


GGTAATGTCA 


TTACACCGGA 


TATGTATAAA 


CGTGTGATGG 


ACTTAGGCGT 


TCATTGTTCA 


11520 




GTCGTTGGTG 


GTGCGATAAC 


ACGACCAAAA 


GAAATTACGA 


AACGTTTTGT 


TCAAATTATG 


115B0 


10 


GAAGATTAAA 


TGATAACGAT 


AAAAAAACGA 


GATGACCATC 


ATTAATTAAA 


GGCACCTAAT 


11640 


TATCTTAGGT 


GGCTGAATGA 


ATGTAATGGG 


TTCATCTCGT 


TTTGTTTGTT 


TATGATAGTG 


11700 




ATTTTATTTT 


CAACTTTATC 


CAAAAATAAG 


TAAAGCGACG 


GGGATGGTGA 


TTAATAGCGA 


11760 


15 


CAACGCCACG 


CGTAAAAACC 


AAATGATGAT 


GAGTTTCCAG 


ACAGGTATTT 


TAATTTCAGT 


11820 




TGCTAGTATA 


CATGGCACTA 


ATGCTGAGAA 


AAAGATAATG 


GCTGATACGC 


TTACTACACC 


11880 




GACGACAAAT 


TTAGTACTCA 


TTGCAGCTTT 


AGTTACTAAC 


AAAGATGGTA 


GAAACATCTC 


11940 


20 


TACAATAGAA 


AckCTGACGC 


TTTTGCTAGT 


AAAGCCTGAT 


CAGCAATTGG 


GAAAATATAA 


12000 




ATAAATGGAT 


AGAAGATATA 


GCCAAGCCAA 


TCAATGAATG 


GTGTATAGTT 


CGCTACAATC 


12060 




AGTCCTAAAA 


AACCAATCGA 


TAATATAGAA 


GGTAAAATAC 


CAACAGTCAT 


TTCTAAACCG 


12120 


25 


TCTTTCAAAT 


TGTCCCAAAC 


GTTCTTCACG 


AGAGATGGTG 


TTAATGCATT 


TTGTTTCATC 


12180 




GCCTCTGCAT 


ATGCAGTTTT 


CAGTCTGCTT 


CCTTCAATAG 


CAACTTCTTG 


TTCTCCTTCT 


12240 




TGTCCGTTAT 


AATATTCTGT 


TGATTCATTG 


CTGATTGGCG 


GTAGCCATGC 


AGTAATTGCA 


12300 


30 


GTCACGACAA 


ATGTGATGAC 


TAAAGTTATC 


CAAAAGTATA 


AATTC CAATG 


CGGCATTAAT 


12360 




CCTAAAGTTT 


TAGCAACGAT 


AATCATAAAA 


GTTGCTGAAA 


CTGTTGAAAA 


GCCAGTCGCA 


12420 




ATAATCGTGG 


CTTCTCGTTT 


GTTGTACATC 


CCTTGCTTAT 


AGACACGATT 


AGTAATCAAT 


12480 


35 


AATC CTAAGG 


AATAACTGCC 


GACAAACGAA 


GCCACTGCAT 


CGACAGCGGA 


TTTTCCTGGT 


12540 




GTTTTAAAAA 


TAGGTCTCAT 


AATAGGCTCC 


ATATAAACAC 


CGACAAATTC 


TAATAAGCCA 


12600 


40 


TAGCCCACTA 


ATAAAGAAAG 


CG c AATTGCA 


CCTACTGGAA 


TTAAGATACT 


TAATGG CATC 


12660 


ATTAATTTTT 


CAAACAAAAA 


CGGACCATAG 


TTAGCTTTAA 


ATAGTATTGA 


TGGACCGATT 


12720 




TTAAATACAT 


ACATTATACC 


GATCATTGCA 


CCTGCAACTT 


TAAATAATGT 


AATG AC CAAG 


12780 


45 


X X i. VJ J. VjXl X X VJ 


r\r\\j i V^rv x Afvt 


t\\3 1 1 1 1— 


AL1A1 lublA 




AATTAAAATC 


12 84 0 




ATAATCAGTG 


CAACATAGGG 


CATAAGTGGA 


CCTATGATTG 


AGCGAATGGC 


TAGATGAACA 


12900 




TGATCGACGA 


AAATAGTGTT 


GTTACCATTA 


ATCGTAAAAG 


GAATAAAGAA 


ACATAGTATG 


12960 




CCCACTAAAC 


TATAGACAAA 


AAAACGCCAT 


GCACTTGGTT 


GTTGTGCATT 


AGAATGATAT 


13020 




TGATTCATTA 


AAGCAACCCC 


TTTGTTTAAA 


TGAATACACA 


AAACTGTATG 


ATGCATCTTC 


13080 
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ATAGTTTGAA 


TTATTTTCAT 


ACCAATACAA 


ATTAACTAAT 


TATATATAGA 


TTGAAACTAT 


13200 




ATTACTTAAT 


AAAATATTTA 


TCTTAAATGT 


TGTTGTGTTG 


ATTCAACACC 


ACAACTAAAA 


13260 




GTGTTTATAA 


ATTATTTGGA 


AATACACATA 


TTTGTAAATG 


ATTAGTATCG 


ATTTAATATC 


13320 




GTATTATTAA 


ATTTTTATTA 


ATTTTGTAGT 


CTTAATCmAA 


AAATAATATA 


TGTCATGTTA 


13380 




TATTGAAGGT 


GCAGTTGTTT 


TTCATTCTCA 


AGAGGGGGTC 


AAAAAAATAC 


TTTTGAGGTG 


13440 


ATTATATGTT 


AAGAGGACAA 


GAAGAAAGAA 


AGTATAGTAT 


TAGAAAGTAT 


TCAATAGGCG 


13500 




TGGTGTCAGT 


GTTAGCGGCT 


ACAATGTTTG 


TTGTGTCATC 


ACATGAAGCA 


CAAGCCTCGG 


13560 




AAAAAACATC 


AACTAATGCA 


GCGGCACAAA 


AAGAAACACT 


AAATCAACCG 


GGAGAACAAG 


13620 




GGAATGCGAT 


AACGTCACAT 


CAAATGCAGT 


CAGGAAAGCA 


ATTAGACGAT 


ATGCATAAAG 


13680 




AGAATGGTAA 


AAGTGGAACA 


GTGACAGAAG 


GTAAAGATAC 


GCTTCAATCA 


TCGAAGCATC 


13740 




AATCAACACA 


AAATAGTAAA 


ACAATCAGAA 


CGCAAAATGA 


TAATCAAGTA 


AAGCAAGATT 


13800 




CTGAACGACA 


AGGTTCTAAA 


CAGTCACACC 


AAAATAATGC 


GACTAATAAT 


ACTGAACGTC 


13860 




AAAATGATCA 


GGTTCAAAAT 


ACCCATCATG 


CTGAACGTAA 


TGGATCACAA 


TCGACAACGT 


13920 




CACAATCGAA 


TGATGTTGAT 


AAATCACAAC 


CATCCATTCC 


GGCACAAAAG 


GTAATACCCA 


13980 




ATCATGATAA 


AGCAGCACCA 


ACTTCAACTA 


CACCCCCGTC 


TAATGATAAA 


ACTGCACCTA 


14040 




AATCAACAAA 


AGCACAAGAT 


GCAACCACGG 


ACAAACATCC


AAATCAACAA 


GATACyXf ATP 


14100




AACCTGCGCA 


TCAAATCATA 


GATGCAAAGC 


AAGATGATAC 


TGTTCGCCAA 


AGTGAACAGA 


14160 




AACCACAAGT 


TGGCGATTTA 


AGTAAACATA 


TCGATGGTCA 


AAATTCCCCA


GAGAAACCGA 


14220 




CAGATAAAAA 


TACTGATaAT 


AAACAACTAA 


TCAAAGATGC 


GCTTCAAGCG 


CCTAAAACAC 


14280 




GTTCGACTAC 


"AAATGCAGCA" 


"GCAGATGCTA" 


"AAAAGGTTCO" 


"ACCACTTAAA" 


GCGAATCAAG 


14340 




TACAACCACT 


TAACAAATAT 


CCAGTTGTTT 


TTGTACATGG 


ATTTTTAGGA 


TTAGTAGGCG 


14400 




ATAATGCACC 


TGCTTTATAT 


CCAAATTATT 


GGGGTGGAAA 


TAAATTTAAA 


GTTATCGAAG 


14460 


AATTGAGAAA 


GCAAGGCTAT 


AATGTACATC 


AAGCAAGTGT


AAGTGCATTT 


GGTAGTAACT 


14520 




ATGATCGCGC 


TGTAGAACTT 


TATTATTACA 


TTAAAGGTGG 


TCGCGTAGAT 


TATGGCGCAG 


14580 




CACATGCAGC 


TAAATACGGA 


CATGAGCGCT 


ATGGTAAGAC 


TTATAAAGGA 


ATCATGCCTA 


14640 




ATTGGGAACC 


TGGTAAAAAG 


GTACATCTTG 


TAGGGCATAG 


TATGGGTGGT 


CAAACAATTC 


14700 




GTTTAATGGA 


AGAGTTTTTA 


AGAAATGGTA 


ACAAAGAAGA 


AATTGCCTAT 


CATAAAGCGC 


14760 




ATGGTGGAGA 


AATATCACCA 


TTATTCACTG 


GTGGTCATAA 


CAATATGGTT 


GCATCAATCA 


14820 




CAACATTAGC 


AACACCACAT


AATGGTTCAC 


AAGCAGCTGA 


TAAGTTTGGA 


AATACAGAAG 


14880 
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ATTTAGGATT AACGCAATGG GGCTTTAAAC AATTACCAAA TGAGAGTTAC ATTGACTATA 15000 

TAAAACGCGT TAGTAAAAGC AAAATTTGGA CATCAGACGA CAATGCTGCC TATGATTTAA 15060 

CGTTAGATGG CTCTGCAAAA TTGAACAACA TGACAAGTAT GAATCCTAAT ATTACGTATA 15120 

CGACTTATAC AGGTGTATCA TCTCATACTG GTCCATTAGG TTATGAAAAT CCTGATTTAG 15180 

GTACATTTTT CTTAATGGCT ACAACGAGTA GAATTATTGG TCATGATGCA AGAGAAGAAT 15240 

GGCGTAAAAA TGATGGTGTC GTACCAGTGA TTTCGTCATT ACATCCGTCC AATCAACCAT 15300

TTGTTAATGT TACGAATGAT GAACCTGCCA CACGCAGAGG TATCTGGCAA GTTAAACCAA 15360

TCATACAAGG ATGGGATCAT GTCGATTTTA TCGGTGTGGA CTTCCTGGAT TTCAAACGTA 15420

AAGGTGCAGA ACTTGCCAAC TTCTATACAG GTATTATAAA TGACTTGTTG CGTGTTGAAG 15480 

CGACTGAAAG TAAAGGAACA CAATTGAAAG CAAGTTAAAT TCATCTTCTG AATTTAATAT 15540 

GCTATGTAAA TCGTGCTGTT ATCATGGCAC ATCAGATATA AGTAGCATCA CAGTGTTGAA 15600

TTTAAAAATA GTAAAGTGAA ATAAAGCGCC TGTCTCATTA GCGAAAACTA AAGGGACAGG 15660 

CGTATCTGTT TATGAGCTTA ATAAATTGTA TGAATAATAT GGTTGATCGA ATAACTGTTT 15720 

ATCATGATGA TAAATTGAGT TTTTTAAAAT AATGATATAT TACATCATTG TTATAGCGTT 15780

TAAGAAATCA ACAACTTTAC GATAAATAGT GATTGCTTCG TCATTAGGTC TACGATCAAA 15840 

ATCATGCTCG TTTTTATTCA CGCGTTCAAA TGTTGAATGT GGAACATGAT TCATGATATG 15900

TTCGCTTTCC TCAACGGGAA CATCATAATC GCCATTACAA TGCGCAATGA AAACAGGTGG 15960

AAGTGTTTTA AGTTCATCTG GTGCAATATT ATATTTTGAA TTAGTATAAT CAGCAATGTT 16020

AATCATATTT ATCCATTTAC CTGTGCCACG TGCATAAACG TAGATTAAAA AACGTTGTGC 16080

GATTTGATCT TGAACAACCG GTGTTGGTGA AGTGAGTTGT GCAATCATTG TTTCGTTTAC 16140 

GCTTTGAGCT ATTTTTGCGT AATAACTATT AGTTGTTTTA AAAGGTTCAG TGTTGATGCG 16200

ACTATAACCA TAAAAATCAA TAACACCATC AATATCTCTG TCTCGTGCAA TTAATAGACT 16260

TAAATATGCA CCTGATGATC TGCCAAAGGT AAAAATAGGG CAATTAGAAT ATTGTGATTG 16320 

AATCGCATCG AATGAtGCgn AGnACATCCT CAATAATGCA ATCGAGACTT ACTTCTGGTA 16380

ATAAACGATA ACTTAGTTGA ATTAAATCGT AATGTTCCGT AAgATATCGA TATACTGTGG 16440

GGATAAATCG TTAGCTTTAC CGAACATTAA TCCACCACCG TGGATGTAGA CAATAGCGCC 16500 

TTTTGTTGGT TGATTTTTTG CTTTAATAAT TGTGTAAGGT AATGCAAATG CATCTTTAGT 16560 

AATTACTTTA TCTTTAATTT CAGTCACGAT TTAATAGGCT CCTTATTTTT GATATTGATG 16620

TCATTATAAC ACTGTCTTAA ATTTCCATGA AAAATAGTCT TAAGACGATG AGTCATGATA 16680 
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w 






CATCATTTTA 
CATAGAAAAT 
AGGTTCTGTG 
AGCAGACTCT 
TTCATAGACA 
TTGATAAATA 
GACCCAATTG 
TTCTTCTGGC 
TTCTTGTACC 
ACCGTAATGG 
TTGTGCAATC 
GCGATACTGC 
AATCGGTGAA 
TTGATTTGAA 
CATCGTAGAT 
ATATTCAAAC 
TGAATTAGAG 
TGATGTGTTT 
TAAGTGGATT 



ACAATATCTT 
TCAAGATTGA 
AGAAAAGGCG 

GGAGAATTAA 
ATTTGGTTAA 
TGTCGACCCC 
ATTAATTGGT 
GTTGCTCGAA 
GCACGCATAA 
TCTGTACGTT 
GAAATTTCCA 
TGAATGATGC 
TGCAATGTCA 
AAACGACCAG 
GCCATGTTAG 
AATTGACCAT 
CGACGTGCAG 
ACGGTCATTG 
GATTGTAAAA 



TAAAAGCAGC 
TATCATGTGG 
AAGACATGCC 
TCCCGCCACT 
CTGGTCGACC 
AGCTAGCGAT 
TGAACTCGTC 
ATCCTAAAAT 
CTTCTAAACA 
TATTCGAAAA 
CACCATCAAA 
TATTGATTTT 
TAGGGCTTGG 
CATGCGCTAg
TTAATCCAGG 
AAGGTTCAAT 
CATAAGCCAA 
GTGATAATAC 
GTGGTTTGTA 



ATGTGGAATG 
TCGCTGTTCA 
GACCATATCT 
TGCAATTAAA 
GAAATGATCA 
TGGTAAGTAT 
AATGGTATAT 
AAAATTGTCA 
TAATCTTGCA 
AGTTGAGAAA 
ACCTGCTTTA 
CTCATGAGAC 
TCCATACACC 
CTGGATAATA 
GATACAAGCA 
GTAAGCAGCG 
GTCTTCTTTT 
AAAGCGATTC 
TCGGTACATA 



GCTAAATCTT 
GCAAGTTTAT 
GCATGTTGTA 
GGGATACGAC 
CCTGGTGTAC 
TGGATGTTTG 
CCTAAATCAC 
GGTGCTTCTT 
CGATTTTTTA 
AATGTTTGAA 
ATCGCGCGTA 
ATGGCGATAA 
TTTCCAAAAT 
GCGAGGCTAC 
TCATGATCAA 
CCGGTGACTT 
GTAATATAGC 
GAAATTTTGA 
CTATGATTCC 



CTAAATCTGC 
GCACAAAGTC 
AAGCATCTAA 
CTGCTAAATG 
GAGACGTATT 
AAACGTCCAT 
TGCCTCTGGT 
TATCAATCAC 
ATGAGTCGGC 
TCAGCAAACG 
ATGTAGCATC 
CATCGTGTTC 
TTAAAATGGC 
CATGTTGTTT 
TATTAAAGCC 
GCATTCCAGC 
CTTCTTTTGT 
TGCCATTAGG 
TTTTCTATTC 



16800 
16860 
16920 
16980 
17040 
17100 
17160 
17220 
17280 
17340 
17400 
17460 
17520 
17580 
17640 
17700 
17760 
17820 
17880 






AATATTGTTT 
AATAGAATTG 
CTGLAAAATAT 
AACGTTATCA 
TGTCGGTTTG 
AGACTATGCC 
ACTTGCAAAA 
TGTACCAGCG 
AAATCACATA
TGCTGAGCAT 



TCAAAGTACC 
GTACATGGAA 
GAATATGAAA 
TATACGTGGG 
ACATGACAGG 
TTAATAGATG 
TTAGCAGATC 
TTTGCGTGTA 
CGAGTTGGCT 
TTTAGAATGA 



ATGGAAAGAA 
AGTATTTTTA 
AAGAAAAATA 
TATATGAAGA 
ATAAGTTTGG 
AAGGTAAGGA 
GACTTGGCTT 
GTAGTCCAGA 
CTGGTGGTGT 
TGGCAGCGTT 



TGAATAATCA 
AAATTAAACT 
AAGGCGAAAA 
GGGAATGGTA 
AGATGACGGA 
TGCACAAAAG 
TAAGCGAATT 
ACTTTTGATG 
GATGCTGCCG 
ATATCCAAAT 



ATGATGAACA 
AATGAATGGC 
GATATAAAAG 
TTAAGAACGC 
TTGGTTAAAT 
GCATTGCAAG 
TGGTTTACGG 
ATGCATACAT 
CACTATCGAC 
CGTATTGATT 



GTCTTGATAG 
ATTTGTAGGT 
TTAATTGAAA 
TAAAATGTTA 
TAAGCGTATT 
ATTCAGTGAC 
AACATCATAA 
TGGCGCAGAC 
CTTATAAAAT 
TAGGTATTGG 



17940 
18000 
18060 
18120 
18180 
18240 
18300 
18360 
18420 
18480 
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TAGTTACGAT GAATCGATTT CGTTATTACG TGATTATCTT ACAATAAAGG ATAAACCAAG 18600 

TGCGCATACG TTAGGTGTCC AACCACACAT TGATCATTTT CCAGAAATGT GGTTATTAAG 18660 

TAGTAGCGCA ACATCTGCCA AAATAGCTGC CGAACTAGGT ATAGGGCTTT CTGTTGGAAC 18720

ATTTTTGCTA CCAGATATAA ATGCGATACA TACAGCGAAG GATAACATTG ATATTTACAA 18780 

AAAACATTTC CAAGCATCAA CGATTAAAAT GGACGCAAAG GTGATGGCAT CTGTATTTGT 18840

CATTGTAGCT GATAACGAAG CGGAAGTAGC AGCATTACAA CATGCCTTAG ATGTTTGGTT 18900

ATTAGGTAAA TTACAATTTG CAGAATTTGA AGATTTTCCT TCAGTAGACA CAGCACAAAA 18960

GTATAAGCTT AATGATCGAG ACAAAGAGAT GATTCAAGCA CATCAAGCAC GCATCATTGC 19020 

AGGTACACAA GAAAAGGTTA AAGCACAATT AGATGATTTC ATTGCTACGT TTGAAGTTGA 19080

TGAGGTGTTA GTAGCACCGC TTATTCCAGG TATTGAACAG CGTTGTAAAA CATTAAAATT 19140 

ACTCGCGGAA ATTTATTTGT AGCATTTTAA ATAGAAGAGA AAGGATGAAG ATAAGATGAA 19200

AAAGTTAGCC AATTATTTAT GGGTAGAAAA AGTAGGAGAT TTGTATGTGT TTAGTATGAC 19260 

ACCTGAATTG CAAGATGATA TTGGGACAGT AGGTTATGTT GAATTCGTAA GTCCAGATGA 19320 

AGTTAAAGTG GATGATGAAA TTGTGAGTAT CGAAGCATCG AAAACGGTCA TTGATGTGCA 19380

AACGCCATTG TCAGGAACGA TTATTGAGCG AAATACAAAA GCGGAAGAAG AACCGACAAT 19440 

TTTAAACTCT GAAAAACCAG AAGAAAATTG GTTGTTCAAA TTGGATGATG TCGATAAAGA 19500 

AGCATTCCTA GCATTACCGG AGGCTTAAAT GGAAACGTTA AAATCAAATA AAGCGAGACT 19560

TGAATATTTA ATCAATGATA TGCATCGAGA GAGAAATGAC AATGACGTAT TGGTAATGCC 19620

ATCTTCATTT GAAGATTTGT GGGAATTATA TCGAGGCTTA GCAAATGTCA GACCGGCATT 19680

ACCTGTAAGT GATGAATATT TAGCTGTACA AGATGCTATG TTAAGTGATT TGAATCGTCA 19740

ACATDTTACG GATTTGAAGG ATTTGAAGCC GATAAAAGGT GACAATATCT TTGTTTGGCA 19800 

AGGTGATATC ACGACGTTAA AAATCGATGC TATTGTTAAT GCTGCAAATA GTCGTTTTCT 19860 

AGGATGTATG CAAGCTAATC ATGACTGCAT TGATAATATT ATTCATACAA AAGCGGGTGT 19920

TCAAGTTGGA CTTGATTGTG CAGAGATCAT TCGACAACAA GGGCGCAATG AAGGTGTAGG 19980

TAAAGCCAAA ATAACACGTG GATATAATTT GCCAGCAAAG TATATAATTC ATACGGTTGG 20040 

TCCGCAAATA CGTCGATTGC CTGTTTCAAA GATGAATCAG GACTTGTTAG CTAAATGTTA 20100 

TCTTAGCTGT CTTAAATTGG CTGATCAACA TAGTTTAAAT CATGTCGCTT TTTGCTGTAT 20160

ATCTACAGGT GTATTTGCTT TTCCTCAAGA TGAAGCAGCA GAAATTGCTG TTCGAACAGT 20220

AGAAAGCTAT CTCAAAGAAA CAAATTCAAC ATTGAAAGTC GTGTTCAATG TATTTACAGA 20280 
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CAATGTCTCT GTTAATGGAT GACAAGACAA AGCAGGCTGA AGTATTGCGT ACTGCGATTG 20400 

ATGAAGCAGA TGCGATAGTG ATTGGAATTG GTGCAGGCAT GTCTGCATCT GACGGATTTA 20460

CATATGTAGG AGAGCGTTTT ACGGAAAATT TCCCAGATTT TATTGAAAAA TATCGCTTCT 20520 

TTGATATGTT GCAAGCGAGT TTACATCCTT ATGGCAGTTG GCAAGAGTAT TGGGCATTTG 20580 

AGAGTCGTTT TATTACATTA AACTATTTAG ATCAACCTGT AGGTCAGTCT TACCTCGCTT 20640

TAAAATCCTT GGTGGAAGGT AAACAGTACC ACATTATAAC TACGAATGCA GATAATGCTT 20700 

TCGATGTAGC TGATTATGAT ATGACTCATG TATTTCATAT ACAAGGGGAG TATATACTGC 20760 

AACAGTGTAG cTCAGCATTG TCATGCTCAA ACGTATCGCA ATGATGATTT AATTCGTAAA 20820

ATGGTTGTTG CGCAACAAGA TATGCTTATA CCTTGGGAGA TGATTCCAAG ATGTCCAAAA 20880 

TGTGATGCCC CAATGGAAGT GAATAAACGT AAAGCGGAAG TTGGGATGGT TGAAGATGCT 20940

GAATTTCATG CGCAACTACA TCGTTATAAT GCTTTTCTAG AGCAACATCA AGATGATAAA 21000 

GTGTTGTATT TGGAAATTGG AATTGGTTAT ACTACACCAC AATTTGTGAA GCATCCTTTT 21060 

CAGCGTATGA CACGTAAAAA TGAAAATGCC CTTTATATGA CGATGAATAA AAAGG CAT AT 21120 

CGCATTCCGA ATTCAATTCA AGAACGTACC ATACATTTAA CTGAGGATAT CTCAACATTG 21180

ATTACAGCAG CACTCCGGAA CGACAGCACA ACGAAAAATA ACAACATTGG AGAGACAGAA 21240

GATGTACTTA ATAGAACCGA TTAGAAATGG AGAATATATT ACTGATGGTG CGATTGCACT 21300 

CGCTATGCAA GTTTATGTTA ACCAGCATAT CTTTTTAGAT GAAGATATTT TATTCCCTTA 21360

TTATTGTGAT CCAAAAGTGG AAATTGGACG TTTTCAAAAT ACTGCTATAG AAGTGAATCA 21420 

AGATTATATA GATAAACACA GTATTCAAGT AGTTCGCCGA GATACTGGTG GTGGCGCTGT 21480

~GTATGTTGAT-AAAGGTGCCG~TTAAtAt^^ 21540 

TGGTSATTTT CAACGATTTT ATCAACCAGC TATAAAGGCG TTGCATACAT TAGGTGCAAC 21600 

AGATGTGGTA CAAAGCGGTA GAAATGATTT AACATTGAAT GGTAAAAAAG TGTCAGGCGC 21660

CGCAATGACA TTAATGAATA ATCGTATTTA TGGCGGTTAT TCGCTATTAC TTGATGTTAA 21720 

TTATGAAGCA ATGGATAAAG TGTTAAAGCC TAATCGCAAA AAGATTGCAT CGAAAGGGAT 21780 

TAAATCTGTG CGCGCACGTG TTGGTCATCT TAGAGAAGCA CTGGATGAAA AGTATCGTGA 21840

TATAACCATT GAAGAATTTA AAAATTTAAT GGTGACGCAG ATTTTGGGAA TCGATGACAT 21900 

TAAAGAGGCG AAACGATATG AATTAACGGA TGCAGATTGG GAAGCGATTG ATGAATTAGC 21960 

TGATAAAAAG TATAAAAATT GGGATTGGAA TTATGGCAAG TCACCCAAAT ATGAATACAA 22020

TCGAAGTGAA AGATTATCTT CAGGTACGGT AGACATAACA ATTTCTGTTG AACAAAATCG 22080 
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AGAAGCATTA CAAGGAACAA AAATGACAAG AGAAGATTTA ACG CATCAGT TAAAGCAATT 22200 

AGACATCGTT TATTATTTTG GCAATGTTAC GGTAGAAGCA TTAGTGGATA TGATTTTAAG 22260 

TTAATATTGT TATTTTATGT ATGCTGAATC ATTGGAAGTG TTTGCTTGCT CTTGAAAAGG 22320 

TGACAATAGT GTTTGGTGAA GGTTGAACAT ATGAGTGGAA ATTATTGCCT TTAACTATTC 22380

AAAGTATGAT ATATATATGG TTTTTGTTTC TAAATGATTG GGTATTTGAA AATAGATGAG 22440 

TTTAATATTT TAAGGAATAT AATGATGTTT ACTTTTATAA TTCATATAGA ATATTAAGCA 22500 

ATATAAGTCT GTTGATATAT ACAAAATATA ATGACTGCTA TAATGAGTAA TCAATAGACA 22560

CAAAGAGGAG ATTATGTGAT GAATAATAAA GTATTAGTAA CCGGTGGTAC AGGGTTTGTT 22620

GGCATGCGAA TTATTTCACG ATTATTAGAA CAAGGTTATG ACGTACAAAC GACGATACGT 22680 

GATTTAAGTA AAGCTGATAA AGTAATTAAA ACAATGCAAG ACAATGGCAT TTCCACAGAG 22740

CGATTAATGT TTGTCGAAGC GGATTTATCA CAAGATGAAC ATTGGGATGA AGCAATGAAA 22800

GATTGCAAGT ATGTCTTGAG TGTAGCATCT CCGGTGTTTT TCGGTAAAAC AGACGATGCA 22860

GAAGTGATGG CGAaCTGcAA TTGAAGGTAT ACAACGTATT TTAAGAGCTG CAGAACATGC 22920

GGGTGTTAAA CGTGTGGTAA TGACTGCAAA CTTTGGTGCA GTTGGTTTTA GTAATAAAGA 22980

TAAAAATTCA ATCACAAATG AAAGTCATTG GACAAATGAA GATGAACCAG GCTTATCAGT 23040

ATATGAAAAA TCAAAATTGT TAGCTGAAAA GGCAGCGTGG GATTTTGTTG AGAATGAAAA 23100

TACAACAGTA GAATTTGCCA CAATCAATCC AGTTGCAATT TTTGGGCCAT CATTAGATGC 23160

ACACGTTTCA GGAAGCTTTC ATTTATTAGA AAATTTATTG AATGGTTCAA TGAAACGTGT 23220

ACCGCAAATT CCGTTAAATG TTGTTGATGT GAGAGACGTA GCTGAACTGC ACATTTTGGC 23280

AATGACAAAT GAACAAGCTA ATGGCAAGCG ATTTATTGCG ACGGCTGATG GACmAATTwA 23340

tTTGTTGGGA ATTGcCAAAt TAATTAAAGA AAAGGGCCTG GAAATAGCTC CAAAAGTTCC 23400

TACTAAAAAA TTACCCAGCT TTATTTTGAG CnAnGnGCC 23439

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4522 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

CCCTTTGAGA GTATATCATC TAGTCAAATT ATGCCTGTCA TTAGAGCGAC TAGCTTTGAT 60 
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TATTATGCAG TCGATTTAGG GAAATCATAT CGTCTAATTG ACGAAAGCAT GTTAGAGGAT 180 
TTGAAGTTAA CTGAACAACA AATAAGAGAA ATGTCTCTGT TTAATGTTAG AAAATTGTCA 240

AATTCATATA CGACTGATGA AGTAAAAGGT AATATTTTTT ATTTTATTAA CTCAAATGAC 300 
GGGTATGATG CAAGTAGGAT ACTAAATACT GCATTTTTAA ATGAAATTGA GGCACAATGT 360 
CAAGGCGAAA TGCTCGTAGC AGTGCCACAC CAAGATGTGT TAATTATTGC AGATATACGC 420

AATAAAACAG GATATGATGT GATGGCACAT TTAACAATGG AATTTTTCAC TAAAGGTCTA 480

GTTCCAATTA CATCATTATC CTTTGGATAT AAACAGGGTC ATCTTGAACC GATATTTATT 540

TTAGGTAAAA ATAATAAACA AAAAAGAGAT CCAAACGTGA TTCAGCGTTT AGAAGCAAAT 600

CGTCGTAAAT TTAATAAAGA TAAATAGAAA TAATTGGATA AGGAGTTTTG TCATAATGAA 660 
TTTATTTTAC AATCCTAAAT ATGTAGGAGA TGTCGCATTT TTACAAATTG AACCAGTTGA 720 
AGGTGAATTA AACTACAATA AAAAAGGTAA TGTTGTTGAA ATTACtAATG AAGGTAATGT 780 
TGTAGGTTAT AATATTTTTG AAATTTCAAA AGATATAACA ATTGAAGAAA AAGGTCATAT 840

TAAATTAACT GATGAACTTG TAAATGTATT CCAAAAGCGT ATTTCAGAAG CTGGTTTTGA 900 
TTATAAATTA AATGCTGATC TATCACCGAA ATTTGTAGTT GGCTACGTTG AAACTAAAGA 960

CAAACATCCT GATGCAGATA AATTAAGTGT ACTAAATGTA AACGTTGGAA ATGACACATT 1020 

ACAAATTGTA TGTGGCGCGC CTAACGTTGA AGCTGGACAG AAAGTTGTTG TTGCTAAAGT 1080

AGGTGCAGTG ATGCCTAGCG GTATGGTAAT TAAAGATGCT GAATTACGTG GTGTTGCCTC 1140

AAGCGGTATG ATTTGTTCAA TGAAAGAATT GAATTTACCT AATGCACCTG AAGAAAAAGG 1200

TATTATGGTA TTAAATGACA GCTATGAAAT TGGACAAGCA TTtTTTGAAT AATTAAGGAA 1260

GGTAGTGAAA ATATGAGGTG GTTTGATAAA TTATTeGGeG AAGATAATGA TTCAAATGAT 1320

GACTTGATTC ATAGAAAGAA AAAAAGACGT CAAGAATCAC AAAATATAGA TrACGATCAT 1380

GACTCATTAC TGCCTCAAAA TAATGATATT TATAGTCGTC CGAGGGGAAA ATTCCGTTTT 1440

CCTATGAGCG TAGCTTATGA AAATGAAAAT GTTGAACAAT CTGCAGATAC TATTTCAGAT 1500 

GAAAAAGAAC AATACCATCG AGACTATCGC AAACAAAGCC ACGATTCTCG TTCACAAAAA 1560 

CGACATCGCC GTAGAAGAAA TCAAACAACT GAAGAACAAA ATTATAGTGA ACAACGTGGG 1620 

AATTCTAAAA TATCACAGCA AAGTATAAAA TATAAAGATC ATTCACATTA CCATACGAAT 1680 

AAGCCAGGTA CATATGTTTC TGCAATTAAT GGTATTGAGA AGGAAACGCA CAAGCCAAAA 1740

ACACATAATA TGTATTCTAA TAATACAAAT CATCGTGCTA AAGATTCAAC TCCAGATTAT 1800

CACAAAGAAA GTTTCAAGAC TTCAGAGGTA CCGTCAGCTA TTTTTGGCAC AATGAAACCT 1860 
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AAACAAAAAT 


ATGATAAATA 


TGTAGCTAAG 


ACGCAAACGT 


CTCAAAATAA 


ACAATTAGAA 


1980 




CAAGAAAAAC 


AAAATGATAG 


TGTTGTCAAA 


CAAGGAACTG 


CATCTAAATC 


ATCTGATGAA 


2040 




AATGTATCAT 


CAACAACAAA 


ATCAATGCCT 


AATTATTCAA 


AAGTTGATAA 


TACTATCAAA 


2100 




ATTGAAAATA 


TTTATGCTTC 


ACAAATTGTT 


GAAGAAATTA 


GACGTGAACG 


AGAACGTAAA 


2160 




GTGCTTCAAA 


AGCGTCGATT 


TAAAAAAGCG 


TTGCAACAAA 


AGCGTGAAGA 


ACATAAAAAC 


2220 




GAAGAGCAAG


ATGCAATACA 


ACGTGCAATT 


GATGAAATGT 


ATGCTAAACA 


AGcGGAACgC


2280 




TATGTTGGTG 


ATAGTTCATT 


AAATGATGAT 


AGTGACTTAA 


CAGATAATAG 


TACAGATGCT 


2340 




AGTCAGCTTC 


ATACAAATGG 


CATAGAGAAT 


GAAACTGTAT 


CAAATGATGA 


AAATAAACAA 


2400 


GCGTCAATAC 


AAAATGAAGA 


CACTAATGAC 


ACTCATGTAG 


ATGAAAGTCC 


ATACAATTAT 


2460 




GAGGAAGTTA 


GTTTGAaTCA 


AGTATCGACA 


ACAAAACAAT 


TGTCAGATGA 


TGAAGTTACG 


2520 




GTTTCGAATG 


TAACGTCTCA 


ACATCAATCA 


GCACTACAAC 


ATAACGTTGA 


AGTAAATGAT 


2580 


AAAGATGAAC 


TAAAAAATCA 


ATCCAGATTA 


ATTGCTGATT 


CAGAAGAAGA 


TGGAGCAACG 


2640 




aATAAAGAAG 


AATATTCAGk 


AAGTCAAATC 


GATGATGCAG 


AATTTTATGA 


ATTAAATGAT 


2700 




ACAGAAGTAG 


ATGAGGATAC 


TACTTCAAAT 


ATCGAAGATA


ATACCAATAG 


AAACGCGTCT 


2760 




GAAATGCATG


TAGACGCTCC 


TAAAACGCAA 


GAGTACGCAG 


TAACTGAATC 


TCAAGTAAAT 


2820 




AATATCGATA 


AAACGGTTGA 


TAATGAAATT 


GAATTAGCAC 


CGCGTCATAA 


AAAAGATGAC 


2880 




CAAACAAACT 


TAAGTGTCAA 


CTCATTGAAA 


ACGAATGATG 


TGAATGATAA 


TCATGTTGTG 


2940 




GAAGATTCAA


GCATGAATGA 


AATAGAAAAG 


AATAACGCAG


AAATTACAGA 


AAATGTGCAA 


3000 




AACGAAGCAG


CTGAAAGTGA 


ACAAAATGTC 


GAAGAGAAAA 


CTATTGAAAA 


CGTAAATCCA 


3060 




AAGAAACAGA 


CTGAAAAGGT 


TTCAACTTTA 


AGTAAAAGAC 


CATTTAATGT 


TGTCATGACG 


3120 




CCATCTGATA 


AAAAGCGTAT 


GATGGATCGT 


AAAAAGCATT 


CAAAAGTCAA 


TGTGCCTGAA 


3180 




TTAAAGCCTG 


TACAAAGTAA 


GCAAGCTGTG 


AGTGAAAGAA 


TGCCTGCGAG 


TCAAGCCACA


3240 




CCATCATCAA 


GATCTGATTC 


ACAAGAGTCA 


AATACAAATG 


CATATAAAAC 


AAATAATATG 


3300 




ACATCAAACA 


ATGTTGaGAA 


CAATCAACTT 


ATTGGTCATG 


CAGAAACAGA 


AAATGATTAT 


3360 




CAAAATGCAC 


AACAATATTC 


AGAGCAGAAA 


CCTTCTGTTG 


aTTCAACTCA 


AACGGAAATA 


3420 


TTTGAAGAAA 


GTCAAGATGA 


TAATCAATTG 


GAAAATGAGC 


AAGTTGATCA 


ATCAACTTCG 


3480 




TCTTCAGTTT 


CAGAAGTAAG 


CGACATAACT 


GAAGAAAGCG 


AAGAAACAAC 


ACATCCAAAC 


3540 




AATACTAGTG 


GACAACAAGA 


TAATGATGAT 


CAACAAAAAG 


ATTTACAGTC 


ATCATTTTCA 


3600 




AATAAAAATG 


AAGATACAGC 


TAATGAAAAT 


AGACCTCGGA 


CGAACCAACA 


AGATGTTGCA 


3660 
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CCAAGTGTTT 


CATTACTAGA AGAACCACAA 


GTTATTGAGT 


CGGACGAGGA 


CTGGATTACA 


3780 


GATAAAAAGA 


AAGAACTGAA 


TGACGCATTA 


TTTTACTTTA 


ATGTACCTGC 


AGAAGTACAA 


3840 


GATGTAACTG 


AAGGTCCAAG 


TGTTACAAGA 


TTTGAATTAT 


CAGTTGAAAA 


AGGTGTTAAA 


3900 


GTTTCAAGAA 


TTACGGCATT 


ACAAGATGAC 


ATTAAAATGG 


CATTGGCAGC 


GAAAGATATT 


3960 


CGTATAGAAG 


CGCCTATTCC 


AGGAACTAGT 


CGTGTTGGTA 


TTGAAGTTCC 


GAACCAAAAT 


4020 


CCAACGACAG 


TCAACTTACG 


TTCTATTATT 


GAATCTCCaA 


GTTTTAAAAA 


TGCTGAATCT 


4080 


AAATTAACAG 


TTGCGATGGG 


GTATAGAATT 


AATAATGAAC 


CATTACTTAT 


GGATATTGCT 


4140 


AAAACGCCAC


ACGCACTAAT 


TGCAGGTGCA 


ACTGGATCAG 


GGAAATCAGT 


TTGTATCAAT 


4200 


AGTATTTTGA 


TGTCTTTACT 


ATATAAAAAT 


CATCCTGAGG 


AATTAAGATT 


ATTACTTATC 


4260 


GATCCAAAAA 


TGGTTGAATT 


AGCTCCTTAT 


AATGGTTTGC 


CACATTTAGT 


TGCACCGGTA 


4320 


ATTACAGATG 


TCAAAGCAGC 


TACACAGAGT 


TTAAAATGGG 


CCGTAGAAGA 


AATGGAACGA 


4380 


CGTTATAAGT 


TATTTGCACA 


TTACCCATGT 


ACGTAnTATA 


ACAGCATTTA 


ACnAAAAAGC 


4440 


CCCATATGAT 


GAAAGAATGn 


CAAAAATTGT 


CATTGTAaTT 


GATGAGTTGG 


CTGATTTAAT 


4500 


GATGATGGTC 


CGCAAGAAGT 


TG 








4522 


(2) INFORMATION FOR SEQ ID NO: 40: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 751 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
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TCAAGTTTAC 


GGATACGTAT 


ATATTTTGCA 


TGACATTTAG 


TGCAATAATA 


TTCATAATTT 


60 


GCCCGTTGTT 


GATAGCTTTC 


AATGCTGTTA


CAAAATCTAG 


GCGCTCCAAC 


CTGTTGGCTC 


120 


AATCGTTTAA 


AATCTTGATC 


TTTATGTTGA 


TAACCTTTAC 


CAGCAATATG 


CAAGTGATAA 


180 


TGACACAATT 


CGTGCAGTAT 


AATTTTTACA 


ACAGCATCTT 


CTCCATAATG 


CTCATATTGT 


240 


TTTGGATTAA 


TTTCAATATC 


ATGGGACTTT 


AAAAGATAAC 


GTCCGCCTGT 


TGTACGTAAC 


300 


CTTTTATTAA 


AATATGCACA 


ATGTCGAAAC 


GTACGTCCAA 


ATTTTTCTTC 


CGAAAGATTC 


360 


TCAACCATTC 


GCTGAAGTTT 


GTCATTATTC 


ATGTGGATCA 


ATCATCGTTA 


ATGATACTTT 


420 


GTCTTTATTT 


TTGTCAATAC 


TGTAAATCCA 


AACGTCAACG 


ATATCACCAA 


CACTGACAAT 


480 


ATCCATTGGA 


TTTTTTACGA 


ACTTCTTAGA 


AAGTTTCGAA 


ACATGGACAA 


GTCCATCTTG 


540 
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TTTCATTCCT TCTTGTAAAT CTTCAATTGA TAG CACAT CG GATTTAAGGA TTGGTGTTTC 66 0 

AAACTCGTCC CTTGGATCTC GATTAGGTGC GTTCAAGGAT TTAATAATAT CCTCTAATGT 720 

AGGTACACCG ACTTGTAATT CAATCGCCAG T 751 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1076 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

TCTCCAGCTT TAACTTGATC TGGCACTTTA ACAATTGTCT GATCCATACA TACGCGACCA 60

ATAACTTCGC ATTGATGACC ATTTACATTT ACAAAGCTAC CTTGCATTAT GCGTAAATGG 120

CCATCTGCAT ATCCAATAgG TAACAATGCT ATTGTAGTTG GGTCAGTAGC TGTATAAGTT 180

GCACCATAAC TTACAGACTC ACCCGCTTGT AGCGTCTTTG TTTGAACTAC ATTAGCAATT 240

AATTGCACAC TTGGTTTAAG GTGTACTTTA ACTTTTTGCT GTACATACTC TGATGGATAA 300

TATCCATAAA GGGAAATTCC TGGTCTTATT GCATTACAGA ATTGGCAATC CATTAATAGA 360 

GAGCCTGCTG AGTTCTGACA ATGTATATAT TCAGGTTTAA TTGCTTCATT GACCATATCT 420

TTAAAACGTT GATATTGTTC AGTTGTCATA TCTCCTGGTT CGTCAGCACA GGCAAAGTGT 480

GTAAACACGC CTTCAAATAC AAGTTGCTCA TATTGTTGAA TGATTTCAAT CACTTCTTGA 540

TACGTTTTAG TATCTTTAAT ACCTAAACGT CCCATTCCTG TATCTAATTT AATGTGCAAC 600 

CATAACTTTT TCTCTTGCTC ACCAGAAATG TTTTTAATTG CTTCTTTCAA CCACTGTTTA 660

GACGGAACCG TTAAGGCAAC TCGGTGTTGT ATCGCTTTAT CAATATCTTT AGCTGGTAAC 720

ACACCTAAGA CTAAAATTTT AGCAGTAATC CCATGCATTC TAAGTTCTAT CGCTTCATCT 780

AACGTTGCTA CAGCAAAAAA TGTGGCGCCA TTTTCCATTA AATGACGTGC TACTTTAACA 840 

CTACCTAGTC CATAGGCATT GGCTTTAACG ACAGCCATCA CTGTTTTATT TGGATGCAAT 900

GTACTGAATA CTTTGAAATT TGATGCAACA GCGTTTAAAT CTACATTCAT ATACGCAGAT 960

CTATAATATT TATCCGACAT ATTACTTCCT CCTGTAATTC CCACACGTTT TAAAACTAGA 1020

TCTTAATTAT CATTGTATAA CAAATTTAAA ATGCTGACTT TTCTAAAACA ACTTGG 1076 
(2) INFORMATION FOR SEQ ID NO: 42:
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2930 base pairs 
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(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

TGACCACAAT GCCCAATACA ACCATCCCAT GGTAAAGCCA AGAGATGAGT CAATAAAGCG 60

TGTTGAATAA GAGCTGAATG AACCTGATAC TGGATAAAAT GTTGCCAACT CTCCAATTGA 120

TGACATTAAG AAATATAGCA TGACACCAAT AACAAGATAA GCGAGTATAG CGCCTCCAGG 180 

ACCAGCTTGA GAAATGATAT TACCAGTAGC TACAAATAGA CCAGTCCCAA TTGCACCACC 240

TATAGCAATC ATGGAAATGT GTCTTGAGTT AAGACTACGG TTCATTTTAT TATCTTCCAT 300

ATTTAGTCTC CCATCTATTT AAATATACCC ATTATTGTAA GCTTTTTAAG TGTACTATTC 360

AATAACTATT TAGTACTGTA AAGCGAAAAA ATTAAAATTT TCTGATTTTT TAATCATCTT 420 

GAGCATGTTT AATTGTAATT TTGATGGGGT TAAATTATAA TATGTATTAA ATTATAATTA 480

TnATAAATTG TGGAGGGaTG ACTATGTCAC AACAAGACAA AAAGTTAACT GGTGTTTTTG 540 

GGCATCCAGT ATCAGACCGA GAAAATAGTA TGACAGCAGG GCCTAGGGGA CCTCTTTTAA 600 

TGCAAGATAT TTACTTTTTA GAGCAAATGT CTCAATTTGA TAGAGAAGTA ATACCAGAAC 660

GTCGAATGCA TGCCAAAGGT TCTGGTGCAT TTGGGACATT TACTGTAACT AAAGATATAA 720 

CAAAATATAC GAATGCTAAA AtATTCTCTG AAATAGGTAA GCAAACCGAA ATGTTTGCCC 780

GTTTCTCTAC TGTAGCAGGA GAACGTGGTG CTGCTGATGC GGAcGTGACA TTCGAGGATT 840

TGCGTTAAAG TTCTACACTG AAGAAGGGAA CTGGGaTTTA GTAGGGAATA ACACACCaGT 900

ATTCTTCTTT AGAGATCCAA AGTTATTTGT TAGTTTAAAT CGTGCGGTGA AACGAGATCC 960 

TAGAACAAAT ATGAGAGATG CACAAAATAA CTGGGATTTC TGGaCGGGTt TCCAGAAGCA 1020 

TTGCACCAAG TAACGATCTT AATGTCAGAT AGAGGGATTC CTAAAGATTT ACGTCATATG 1080 

CATCSGGTTCG GTTCTCACAC ATACTCTATG TATAATGATT CTGGTGAACG TGTTTGGGTT 1140

AAATTCCATT TTAGAACGCA ACAAGGTATT GAAAACTTAA CTGATGAAGA AGCTGCTGAA 1200

ATTATAGCTA CAGATCGTGA TTCATCTCAA CGCGATTTAT TCGAAGCCAT TGAAAAAGGT 1260

GATTATCCAA AATGGACAAT GTATATTCAA GTAATGACTG AGGAACAAGC TAAAAACCAT 1320

AAAGATAATC CATTTGATTT AACAAAAGTA TGGTATCACG ATGAGTATCC TCTAATTGAA 1380

GTTGGAGAGT TTGAATTAAA TAGAAATCCA GATAATTACT TTATGGATGT TGAACAAGCT 1440 

GCGTTTGCAC CAACTAATAT TATTCCAGGA TTAGATTTTT CTCCAGACAA AATGCTGCAA 1500

GGGCGTTTAT TCTCATATGG CGATGCGCAA AGATATCGAT TAGGAGTTAA TCATTGGCAG 1560 
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GGTCAAATGC GCGTAGTTGA CAATAACCAA GGTGGAGGAA CACATTATTA TCCAAATAAC 1680 

CATGGTAAAT TTGATTCTCA ACCTGAATAT AAAAAGCCAC CATTCCCAAC TGATGGATAC 1740

GGCTATGAAT ATAATCAACG TCAAGATGAT GATAATTATT TTGAACAACC AGGTAAATTG 1800 

TTTAGATTAC AATCAGAGGA CGCTAAAGAA AGAATTTTTA CAAATACAGC AAATGCAATG 1860

GAAGGCGTAA CGGATGATGT TAAACGACGT CATATTCGTC ATTGTTACAA AGCTGACCCA 1920

GAATATGGTA AAGGTGTTGC AAAAGCATTA GGTATTGATA TAAATTCTAT TGATCTTGAA 1980 

ACTGAAAATG ATGAAACATA CGAAAACTTT GAAAAATAAA TTTGATATGT AGTTTCTATA 2040

TTGCGTAGTT GAGCAGTTTA TGATATCATA ATAAATCGTA AAGATTCCTA ACAAGAGAGG 2100 

GTGTTTAACG TGCGCGTAAA CGTAACATTA GCATGCACAG AATGTGGCGA TCGTAACTAT 2160

ATCACTACTA AAAATAAACG TAATAATCCT GAGCGTATTG AAATGAAAAA ATATTGCCCA 2220

AGATTAAACA AATATACGTT ACATCGTGAA ACTAAGTAAT TCTTATCATT CAAATACGAC 2280

GATTTGAAAA TAAAGCGGGC TTACCTATTA TATTGGGGAG CTCGCTTTTT TATGAAATTT 2340

TTGTGAAGAG TGATTAATGG ATTGAGTTTC ATCGGTAGAA CAATATATGA TTATATTAGT 2400

TGTTACTTTA TTAAAaTTTG AGAATATTTA TAGAAGGAAA TAGATTACTG ATTTTATAAA 2460

GTCACTTTGT TAGCGAATGC TTGAAAGAGT ATTTAATATA GTAGAATTTA AAATTTCAAA 2520

GCGGAATTTA ATAAGTACGA AGTAGTTCTG GGTATGTTTT ATAAATGTTC GATAATACAC 2580

TTTAATCTTA AATATGATGG TTTAGAAAAT GATTTAACAA AGAAATGAaA CTTTACTGTT 2640

GAATTATGTG AGGATTGTGT TATTATATAA ATCGTAATAA TTACGATTTG ATAAAAAGTG 2700

AGGTAACTAT ATATGGCTAA GAAATCTAAA ATAGCAAAAG AGAGAAAAAG AGAAGAGTTA 2760

GTAAATAAAT ATTACGAATT ACGTAAAGAG TTAAAAGCAA AAGGTGATTA CGAAGCGTTA 2820 

AGAAAATTAC CAAGAGATTC ATCACCTACA CGTTTAACTA GAAGATGTAA AGTAACTGGA 2880

AGACCTAGAG GTGTATTACG TAAATTTGAA ATGTCTCGTA TTGCGTTTAG 2930
(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3606 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
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(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
CTTCTTGCCA TGGCTCTCTT TATTTAAAAA TGCTTCCAAC TTGTCCATTT GATTGTTTCT 60
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TTATAAAAAA 


CTAATTTTAC 


AAATGCTTTT 


GCGTTCTTAC 


AAAAAATGCA 


TTTGACTATT 


180 




ATTATAATAA 


GCGTATAATT 


GTCGCATATT 


ATTTTTTGTA 


TTTTTGGCAA 


TAACGAAGGA 


240 




GTATTTATGA 


ATAAAGACAA 


GCAATTGCAC 


AACGACAAAA 


TCAATCTATC 


CCAATTAGTC 


300 




TTATTAGGGT 


TAGGCTCTTT 


AATAGGATCT 


GGTTGGCTAT 


TTGGTGCGTG 


GGAAGCATCA 


360 




TCAATAGCTG 


GACCAGCAGC 


AATCATATCA 


TGGGTTCTTG 


GATTCCTAGT 


CATTGGAACC 


420 




ATTGCCTATA 


ACTACATTGA 


AATCGGCACA 


ATGTTTCCTC 


AATCAGGTGG 


CATGAGTAAC 


480 




TATGCCCAGT 


ATACACATGG 


CTCATTATTA 


GGCTTTATTG 


CTGCTTGGGC 


GAATTGGGTG 


540 




TCTTTGGTGA 


CAATAATACC 


TATCGAAGCT 


GTGTCAGCTG 


TTCAATATAT 


GAGTTCTTGG 


600 


CCGTGGCATT 


GGGCGAAACC 


AATGAGATAT 


TTAATGGAAA 


ATGGCTCTAT 


TAGCACATAC 


660 




GGATTGCTAG 


CTGTATATCT 


CATCATTGTT 


ATTTTTTCAT 


TATTAAACTA 


TTGGTCCGTA 


720 




AAACTTTTAA 


CATCATTTAC 


GAGTTTAATT 


TCTGTATTTA 


AATTAGGCGT 


ACCCATGTTA 


780 


ACCATCATCA 


TGTTGATGCT 


ATCAGGATTC 


GACACTTCAA 


ATTACGGCCA 


TTCGGCAAGC 


840 




ACATTTATGC 


CTTACGGAAG 


TGCACCGATT


TTTGCTGCAA 


CAACAGCATC 


AGGGATTATT 


900 




TTTTCATTCA 


ATTCATTCCA 


GACAATTATT 


AATATGGGTT 


CAGAAATTAA 


AAATCCTGAA 


960 




AAAAATATCG 


CAAGAGGCAT 


CGCTATCTCA 


CTGTCAATCA 


GTGCAGTGTT 


GTACATCATT 


1020 




TTACAAAGTA 


CGTTTATCAC 


TTCTATGCCT 


CAATCAATGT 


TACAACATAG 


TGGATGGAAT 


1080 




GGCATCAACT 


TCAATTCACC 


ATTTGCTGAT 


TTAGCTATCT 


TATTAGGAAT 


TAATTGGCTC 


1140 




GCAATTTTAC 


TATACATTGA 


AGCTTTTGTA 


TCACCATTCG 


GTACTGGCGT 


GTCATTTGTC 


1200 




GCCGTTACAG 


GTCGAGTTTT 


ACGAGCAATG 


GAGAAAAATG 


GACATATCCC 


TAAATTTCTT 


1260 




GGGAAGATGA 


ATGAAAAATA 


TCATATCCCA 


CGTGTAGCAA 


TCATCTTTAA 


TGCCATCATT 


1320 




AGTXTGATTA 


TGGTTACATT 


ATTTAGAGAT 


TGGGGTACGC 


TAGCAGCAGT 


TATTTCTACT 


1380 




GCAACTTTAG 


TAGCCTATTT 


AACTGGCCCA 


ACGACAGTGA 


TTGCATTAAG 


AAAAATGGGA 


1440 




CCAACAATGA 


CTCGTCCATT 


TAGAGCAAAA 


ATTTTAAAAG 


TAATGGCACC 


ATTATCATTT 


1500 




GTATTAGCTT 


CATTAGCTAT 


ATATTGGGCA 


ATGTGGCCAA 


CAACGGCTGA 


AGTTATTTTA 


1560 




ATCATTATAC 


TTGGATTACC 


AATCTACTTC 


TTCTATGAAT 


ATCGTATGAA 


TTGGCGTAAT 


1620 




ACAAAGAAAC 


AAATTGGTGG 


TAGCTTATGG 


ATTATTGTAT 


ATTTAATCGT 


GCTATCAATA 


1680 




CTGTCATTTA 


TAGGAAGCAA 


AGAATTTAAA 


GGCTTAAATA 


TGATTCACTA 


TCCATTTGAC 


1740 




TTTATCGTTA 


TTATTATTGT 


GGCACTTATC 


TTCTATTACA 


TCGGTACAAC 


GAGTTCATTT 


1800 




GAAAGCGTCT 


ATTTCCGTCG 


CGCAACACGA 


ATCAATACGA 


AGATGCGTGA 


GTCACTAAAT 


1860 
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CACACACATT AAC CAACCAT TGATTTCAAC ATCTTGGTTG GTTTTTTATT TTGAAAATCG 1980 

GTTATAAATA ACTAACATAA CAAGATGATG ATCAGGCTGG GACATAAATC AATGTTCTAT 2040 

GCTCTACGAA gTTATATTGG CAGTAGTTGA CTGAACGAAA ATGCGCTTGT AACAAGCTTT 2100

TTTCGATTCT AGTCAGGGGC CCCAACACAG AGAATTTCGA AAAGAAATTC TACAGGCAAT 2160 

GCAAGTTGGG GTGGGACGAC GATAAAGAAA TACTTTTTCT ATAGAAATTA GTATytCTTA 2220 

TGCATGAGTT TTACTCATGT ATTCATATTT TTAAGTACAC ATTAGCTGTG GCTAATGTAT 2280

AAGAACCACT ACATAATAAA TCATTTGTGG CTCTTTATCA TTTCTGTCCC ACTCCCGTAG 2340

AAGTACATCA TATAATGCTG AAAATGGTTT GAGTTAAAAC AGATATGAAG CTCGTCTGAT 2400

TCAGTCACAA AATTGTCTTG TTATACTTGT CACCTATCAT CTATAGACCG TGGTATGATT 2460

AAATTGGGGA TGATAAAGGA GGTTAATAAA TATGAAGATT AATACTACAG GTGGTCAAAT 2520

TCATGGTATT ACACAAGATG GTTTAGATAT CTTCTTAGGC ATTCCTTATG CAGAACCACC 2580

AGTTCATGAC AATCGCTTTA AACATTCTAC GTTAAAAACA CAATGGTCAG AGCGAATTGA 2640

TGCAACTGAA ATACAACCCA TCCCACCGCA ACCAGACAAC AAATTAGAAG ATTTTTTCTC 2700

CTCACAATCT ACAACTTTTA CTGAACATGA AGACTGTTTA TATCTAAATA TTTGGAAACA 2760

ACATAATGAT CAGACGAAGA AACCTGTCAT CATTTATTTT TATGGTGGTA GTTTTGAAAA 2820

TGGTCATGGT ACAGCCGAAC TCTATCAACC GGCACATTTA GTACAAAATA ACGACATTAT 2880

CGTTATTACA TGCAATTATC GTTTAGGCGC ATTAGGATAT TTAGACTGGT CATATTTTAA 2940 

TAAAGATTTT CATTCCAATA ATGGCCTTTC AGATCAAATC AATGTCATAA AATGGGTGCA 3000 

TCAATTTATT GAATCCTTCG GTGGCGACGC TAATAACATT ACTTTAATGG GTCAGTCTGC 3060

AGGCAGTATG AGCATTTTGA CTTTACTTAA AATACCTGAC ATTGAGCCAT ACTTCCATAA 3120

AGTQGTTCTA CTAAGTGGCG CACTACGATT AGACACCCTT GAGAGTGCAC GCAATAAAGC 3180 

ACAACATTTC CAAAAAATGA TGCTCGATTA TTTAGATACA GATGATGTTA CATCATTATC 3 24 0 

GACAAATGAT ATTCTTATGC TGATGGCGAA gcTAAAACAA TCTCGAGGAC CTTCTAAAGG 3300 

GCTTGATTTA ATATATGCGC CTATTAAAAC AGATTATATA CAAAATAATT ATCCAACAAC 3360 

GAAACCAATT TTTGCATGTT ATACAAAAGA TGAAGGCGAT ATTTATATTA CTAGTGAACA 3420 

45 GAAAAAATTA TCGCCGCAAC GCTTTATCGA CATTATGGAA TTAAATGATA TTCCTTTAAA 34 80 

ATACGAAGAT GTTCAGACGG CGAAGcAACA ATCTTTAGCG ATTACACATT GTTATTTCaA 3 54 0 

ACAG CCGATG aAGCAATTTT TACmACmACT CAATATACmA GATTCCAACC GCACCAACTA 3600 

SO TGGCTT 3606 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15109 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

GAAATTAAAA AAGCAATTGG nACAAGATGC AACAGTGTCA TTGTTTGATG AATTTGATAA 60

AAAATTATAC ACTTACGGCG ATAACTGGGG TCGTGGTGGA GAAGTATTAT ATCAAGCATT 120

TGGTTTGAAA ATGCAACsAG AACAACAAAA GTTAACTGCA AAAGCAGGTT GGGCTGAAGT 180

GAAACAAGAA GAAATTGAAA AATATGCTGG TGATTACATT GTGAGTACAA GTGAAGGTAA 240

ACCTACACCA GGATACGAAT CAACAAACAT GTGGaAGAAT TTGAAAGCTA CTAAAGAAGG 300

ACATATTGTT AAAGTTGATG CTGGTACATA CTGGTACAAC GATCCTTATA CATTAGATTT 360

CATGCGTAAA GATTTAAAAG AmAAATTAAT TAAAGCTGCA AAATAATTCA GCTATATAAG 420

TTAGTGAAAT GAGAGTCTGA AACATATCAA TCTTTTGATA TTGTATTAGG CTCTTATTTT 480

TATAGCTAGA AAGTTAGATA TTTGTATTTT TTTAAATAAT AAGTGCCGTT GTTATCGTTC 540

AATTTAATTA ATGATAGATT AGTATTATTA TAGCTAAAGT AGTATACCTG AGAAAATAGC 600

TCAATGTATC TCTTTATTAA TAAGTTATAT CATAATTATT TTAGTGCATA CTTTATGGAA 660

GGGATATCAG GGAATGGCTT TCAATTAAAG AAGAGGTTTA AAAGGATTAC AACAGAATGT 720

TATGATTTTG TAGAAAGATA TATAACAACG TTTTATAAAA ACATAATATT GTTAATGGAA 780

AATGAAATGT AAGGGGGATT TCGAGTGACT AAGAAAGTTT ATTTTAACCA CGATGGTGGT 840

GTAGATGATT TAGTATCTCT ATTTTTATTA TTACAAATGG AAAACGTTCA ATTGATAGGG 900

GTCAGTACAA TTGGTGCTGA TTGTTATTTA GAGCCATCTT TGAGCGCATC AGTAAAAATT 960

ATTAATCGTT TTTCAAATGA AGATATTCAA GTTGCGCCAT CATATGAACG AGGAAAAAAT 1020

CCATTTCCTA AAGAATGGCG TATGCATGCC TTTTTTATGG ACGCATTGCC AATTTTAAAT 1080

GAGCCAGTCA AACATGTTGC TTCAAATGTG AGCGACAAAG AAGCCTTTGA AGACATTATT 1140

CAAACTTTAA AGAGACAATC AGAAAAAGTA ACATTATTAT TTACAGGCCC GCTTACAGAT 1200

TTAGCAAAAG CACTACAAAA AGATTCATCT ATCGTTCAGT ATATAGAAAA ATTAGTTTGG 1260

ATGGGTGGCA CCTTTTTACC AAAAGGAAAT GTTGAAGAAC CTGAGCATGA TGGTTCTGCA 1320

GAATGGAATG CATATTGGGA TCCAGAAGCG GTTAAAATTG TTTTTGATAG CGATATAGAG 1380

ATTGATATGG TTGCTTTAGA AAGTACGAAT CAAGTACCGC TAACGTTAGA TGTTAGACAA 1440

GTACCACCAT TAACACACTT TATAACAAAT

ACTGCTTATA TTGGTAACAA GGACTTGGTT

AGTTATGGAC CAAGTCAAGG TAAGACATTT

ATAAATCATG TAGATAACAA CGCATTTTTT

AATTAACAGC TGTGTAGAAT AATTAAGGTT

TTTTCATTTC TTAAAGTTTA CAATGGTGCT

TAAAAAATGA CAACAAAACA GTTAGTATAT

TTAGGATTGG TACCGGTAAT TCCACTACCA

ATTGGTATTT TCTTAGCAGG TGCGATTTTA

GTCTTTTTAT TATTAGTAGT TGCTGGCTTG

GGTGTATTCG CAGGTCCTTC AGCAGGGTTT

ATTGGGGCGA TTCGAGATAG ATTCATCAAT

ATTTTAGTTT TTGGTGTTAT AGCATTAGAT

ATTAACATAC CATTTACGAA AGCTATTTCA

TTAAAAGCAA TTGTAGCAAG TTTGATTGGT

CAAATTATGG GAATAAAATA ATCATATTTA

GAAATTTATA AAAGTGAAAG GAGTAGGTGT

ATTGTAACGG CACTATATTT GAAAATGACG

TTCCACAACG ATACACAAGT AACACATGGA

GGCTGAAAGA CTTAGAACGT CAACATCAAT

CGTTTAGTTT CCCGGAAAAT GAACAACTTG

TGAATTTTGA ACTAGGTATT ATGGAATTGT

TGCCGCGTAA CTCTGACGTT GAAATTGCCA

TAAAAGTTGC ATATCAGTTT AGTTTGCCAT

AAATGGTAAG GGAACATTAT CAAAAAGATG

ATGAACCTAT TGGCGTTGTA GATGTCATTG

TTGGTGTATT AGAACAATTT CGGCACCAAG

GTGAATACGC CATATCAAAA AATCACAAAC

CAGCAAAAGA TATGTATGCA AAGCAAGGTT
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TCTACTTACT TTTTATGGGA TGTTTTAACG 1560 

CATTCAATTG AGAAAAAAGT CGATGTAATA 1620 

GAGTGTAAAG ATGGGCGCAA AATTAATGTC 1680

GATTATATAA CTGCACTTGC TAAAAAAGTA 1740

TTAATTTATA TAGAACAACT TATTGTAAAC 1800

ATAATAATGG TCATGAAATA CGAAAGGAAG 1860

ACAGCTTTAA TGACAGCGAT TATCGCTATT 1920

TTTTCTTCAG TACCAATTGT ACTTCAAAAC 1980

GGACGTAAAT ATGGCACATT AAGTGTTATC 2040

CCATTGTTAT CAGGTGGTCG CGGTGGCATC 2100

TTACTATTAT ATCCAGTTGT AGCATTCATG 2160

GAAATTAATT TCTGGATTTT ATTCGTTGGT 2220

GTTATTGGTA CATTGATTAT GGGCATGATT 2280

ATTTCATTAG CTTATTTGCC TGGTGATATA 2340

ACAGCTTTAC TTAATCACTC GCAGTTTCGT 2400

AGATAGTAAA GTAATTGAAT AAGTTGCTTT 2460

CAATGGCTAG TATAAGTATG TCAGATATAT 2520

ACGAGCAGTT GATTTATTTA ACGCCTTCTT 2580

TATATAAAAA GACGCCTACC CAAGAGCGAT 2640

TACATACAAA TCAAGGTTCA AATCATTATG 2700

ATAATCATTG GATGGCTATG TTTAAAGATA 2760

ATGCCATAGA AAGTGATGCG CTTGCCAATT 2820

TCGTTGACGA GTCGCATATA GATGCCTATT 2880

TTGGAAAAGA CTATGCAGAT GCACATGAAG 2940

TGATTAAACG CTTAGTAGCT TATTTAAATA 3000

AAAGTGAAAA TTACATTGAA TTAGATGGAT 3060

GAATTGGATC TACAATTCAA TCGTTGATAG 3120

CAATCATATT AGTTGCAGAT GGTGAAGATA 3180

ATGTCTATCA ATCGTTTTGT TATCAAATAT 3240
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TAAGCTGGTT TCGAGTAGAA ATCAACTTAC TGCTTTTTAA ATTGTTTTGA GCTACTTATA 3360 

CTTATAAAAA TAGTGCGTTT AAATTGTTGA TTCATGTAGA ATATCGTTCA TTATGACACA 3420

CTA7AATGAA TATGTTATTG TTCAGAATCA ATGATACGTT CTGGATGACT GTATATATTA 3480

AAGCCACCAT TTCGAATAAA TCCAACTGCC GTAATATTTA GGTCATTAGC TAAGGTTACA 3540

GCAAGCGTTG TCGGAGCTGA TTTAGATAAA ATGACGCCAA CACCAATTTT TGCGGCTTTA 3600

ATTAAAATTT CTGATGAAAT ACGTCCACTA AAAATTAATA CTTTATCTCG GACAGTAATA 3660

TGTCGCTGAA TACAAAATCC ATATAATTTA TCTAGAGCGT TATGTCTACC AATGTCTTGT 3720

CGATGTACAA AAAATGTCAA ACCATCGCTT ATAGCAGCAT TATGTAAGCC ACCTGTTTCT 3780

TGGTAAATAT GACTTGCACT TTGTAATCGA GTCATCATGT TAATAATTTG CATTGGAGTT 3840

AAAGTGATTT TAGACATAGA TGTTTTAGCG ATAGCAGCAT CATTTTGAAA ATAAAACTCA 3900

CGACTCTTTC CGCAACAAGA TGCAATCATT CGTTTTGTGG AATATTGAAA GCGATCGCCT 3960

AAATCTTTAT TAAGTTCAAC ATGGGCAAAA CCTTTACTAT CATCAATCAG TACAGATTTT 4020

AATTCATCTC GCTTTAAAAT GGCACCTTCC GAAGCCAGAA ATCCAATGAC TAACTCCTCA 4080

AGGTTTGTTG GACTGCATAT AACAGTCGCA AATTCTTCAC CATTCACCAT AATTGTAAGT 4140

GGAAATTCTG TCACATATTG ATCTGTTGTA TTGAATAATT TTCCATCTTC ATATCTAACA 4200

ATTGGTTGAC CTAAAGATAC ATCTTTGTTC ATTATCTAAC CCCTTTAATT AGCTTAAACT 4260

TTATTTTAAA GCAATTTGCT TAAAATTTTA ACATATTTGC TTAAGTTTGA AATTTGATTG 4320

ATAAAAATTA ATAGCGAGCA ATCTGTTTGA TTTAAATTGA ATTCGAGAAT ATACATACTA 4380

GGGCATCAAT TAATAAATAT CAATCTTATG CAAATTTGAC AATTGTTTGA ATCAATATAT 4440

AAACAGGCAA CGGTTCTTTT CAAATATAAT AGTAAGTGTA TAATGAAAAT GTAAATATTA 4500

TTAAAAATGG GGGTTCACTC AATGAAATTG AAACGTTTAT TTGCTGTTGT GATTGCAATG 4560

CTTTTAGTAT TAGCTGGTTG CTCTAATTCT AACGATAATA ATGAAAGTAA AAAAGATGAC 4620

GCAGACAATG GTAAGAAACA AGAGATTCAA GTTGCAGCGG CAGCAAGTTT AACAGATGTA 4680

ACCAAGAAAT TAGCTTCAGA ATTTAAAAAA GAGCATAAAA ATGCTGATAT TAAATTTAAC 4740

TATGGTGGAT CAGGGGCATT AAGAAAACAA ATTGAATCAG GCGCACCTGT TGACGTATTT 4800

ATGTCTGCAA ATACTAAAGA TGTAGATGCA TTAAAAGACA AGAATAAAGC GCATGATACA 4860

TATAAATATG CGAAAAATAG TCTAGTATTA ATTGGTGATA AAGATTCAAA TTACACTTCA 4920

GTAAAAGACT TAAAAGACAA TGATAAATTA GCATTAGGTG AAGTGAAAAC TGTACCAGCA 4980

GGAAAATATG CGAAACAGTA TTTAGATAAC AATAACTTAT TTAAAGAAGT CGAAAGTAAA 5040
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CAAGGTTTTG TGTATAAAAC TGACTTATAT AAACAAAATA AAAAAATTGA TACTGTAAAA ~ 5i 60 

GTAATTAAAG AAGTAGAACT TAAGAAGCCA ATCACATACG AAGCTGGTGC TACATCAGAT 5220 

AGTAAATTAG CAAAAGAGTG GATGGAATTC TTAAAATCAG ATAAAGCTAA AGAAATACTA 5280 

AAAGAATACC ACTTTGCAGC ATAAGGAGTT GTAATCCATG CCTGACTTAA CACCTTTTTG 5340 

GATATCAATA CGAGTTGCTG TAATCAGTAC GATTATTGTA ACGGTTTTAG GTATTTTTAT 5400

ATCTAAATGG TTGTATCGTC GTAAGGGTTC GTGGGTTAAA GTATTGGAAA GTTTATTGAT 5460

ATTACCTATT GTTTTGCCGC CAACGGTATT AGGTTTTATT CTATTAATCA TCTTCTCGCC 5520

AAGAGGACCA ATCGGTCAAT TCTTTGCGAA TGTACTACAT TTACCTGTAG TGTTCACTTT 5580

GACAGGTGCT GTGATAGCAT CTGTCATTGT TAGTTTTCCA CTAATGTATC AACATACTGT 5640

GCAAGGCTTC AGAGGTATAG ACACGAAAAT GATTAATACA GCTAGAACGA TGGGAGCAAG 5700

TGAAACGAAA ATTTTCCTCA AATTAATTTT ACCATTAGCT AAACGCTCTA TTTTAGCAGG 5760

TATAATGATG AGTTTTGCTC GTGCATTAGG TGAGTTTGGT GCTACATTAA TGGTTGCAGG 5820

ATATATTCCA AATAAAACGA ATACACTACC TTTAGAAATA TACTTCTTAG TGGAACAAGG 5880

TAGAGAAAAT GAAGCGTGGT TATGGGTATT AGTGCTAGTC GCATTCTCTA TTGTGGTTAT 5940

ATCTACAATT AATTTATTGA ATAAAGATAA ATATAAGGAG GTCGACTAGA TGCTTAAAAT 6000

CAATGTGAAA TATCAATTAA AGAACACTTT AATTCGCATC AATATAGATG ATACTGAACC 6060

AAAAATTTAT GCAGTTCGTG GTCCATCTGG CATTGGTAAA ACTACTGTTT TAAATATGAT 6120

TGCCGGATTA CGTAAAGCAG ATGAAGCTAT TATCGAAGTG AATGGGCAAT TACTTACTGA 6180

TACGGCAAAA AACGTGAATG TTAAAATTCA ACAACGACGT ATTGGATATC TGTTTCAAGA 6240

CTACCAATTG TTTCCTAATA TGACGGTCTA TAAAAATATT ACTTTTATGG CTGAACCATC 6300

TGAAeACATC GATCAATTAA TTCAAACTTT AAACATTGAT CATTTGATGA AACAATATCC 6360

TATGACATTG TCAGGTGGAG AGGCACAACG TGTAGCACTT GCACGTGCAC TTAGCACrAA 6420

ACCAGATTTA ATTTTATTAG ATGAACCTTT TTCTAGTTTG GATGATACTA CAAAAGATGA 6480

GAGTATTACA TTAGTTAAAC GTATTTTCAA CGAATGGCAA ATACCAATCA TATTTGTGAC 6540

ACATTCAAAC TATGAAGCAG AACAAATGGC TCATGAAATT ATTACAATTG GGTAATCATT 6600

TATTTGCCAT TAAAGAGTTT AGAACGTATT TAAAATTGTA GAAGTGAATG CTTCTATCAG 6660

CATTTTAATG ATGTTTTAAA CTCTTTTTTA GGGGCAGTTT TTTTGAGAGA CATTGACGCG 6720

CGTCATATAA TGAAAGTAAT GATAAAAAGA AAGGATAACT TAATGTGAGT CAAGAACGTT 6780

ATTCAAGGCA AATTTTATTT AAACAAATAG GTGAAATAGG TCAAAGCAAA ATAAATCAAA 6840
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GAGCAGGCAT TGCCAAACTA ATCATTGTTG ATAGAGATTA TATTGAATTT AGTAATTTAC 6960 

AAAGACAAAC ATTGTTTACT GAAGAAGATG CTTTGAAAAT GATGCCTAAG GTGGTTCCAG 7020 

CTAAAAAGCA TTTGCTAGCG TTACGTAGTG ATGTTGATAT TGATGATTAT ATTGCCCATG 7080 

TGGATTATTA TTTTTTGGAA ACACATGGAC AGGACGTTGA CGTTATTATT GATGCAACCG 7140

ATAACTTTGA AACACGACAA CTGATTAATG ATTTTGCATA TAAATATCGT ATACCTTGGA 7200

TTTATGGTGG TGTTGTACAG AGTACATATA CAGAAGCTGC ATTTATACCT GGTAAAACAC 7260

CTTGCTTTAA CTGTTTGGTA CCACAATTGC CAGCATTAAA TTTAACATGT GATACAGTAG 7320

GGGTCATTCA ACCTGCCGTG ACGATGGCAA CAAGTTTACA ATTAAGAGAT GCGATGAAAG 7380

TATTAACGGA ACAACCAATT GACACAAAAA TAACTTATGG CGATATTTGG GAAGGTAGTC 7440

ATTATTCATT TGGTTTCAGT AAAATGCAAC GTTCAGACTG TACAACTTGT GGAGATGTAC 7500

CAAGTTATCC GTATTTAAAC AAGAATGAAC AACGTTATGC AACATTGTGT GGTAGAGACA 7560

CTGTACAGTA TGAAAATGCA TCAATTACAC ACGACATTCT TGTTCAATTT TTAAAACAAC 7620

ATCAGTTAAA TTATCGCAGT AATTCGTATA TGGTTATGTT TGAATTTAAA GGACACCGCA 7680

TTGTTGCTTT TAAAGGTGGA AGGTTTTTAA TACATGGCAT GACACGCACA TCAGATGCCA 7740

CACATCTAAT GAATTTATTG TTTGGATAAA AAAAGATAAG ACAAAAGGAG TGTAATATTA 7800

TGGGCGAACA TCAAAACGTT AAATTGAATC GTACAGTTAA AGCAGCCGTA CTAACGGTAT 7860

CAGATACTAG AGACTTTGAT ACAGATAAAG GTGGTCAATG CGTGCGCCAA CTATTACAAG 7920

CAGATGACGT TGAAGTGAGT GACGCACATT ATACAATTGT GAAAGATGAA AAAGTAGCCA 7980

TCACGACGCA GGTGAAGAAG TGGTTAGAAG AAGATATTGA TGTCATCATT ACGACTGGTG 8040

GAACAGGTAT TGCACAACGT GATGTGACGA TTGAAGCAGT AAAACCACTT TTAACTAAAG 8100

AGAtAGAAGG CTTTGGGGAA TTGTTTAGAT ATTTGAGTTA TGTTGAAGAT GTTGGCACGC 8160

GTGCATTATT GTCTCGTGCT GTAGCAGGTA CAGTTAATAA TAAATTGATA TTTTCGATTC 8220

CAGGATCAAC AGGCGCAGTT AAATTAGCAT TAGAAAAGCT CATTAAACCA GAATTAAATC 8280

ATCTGATTCA TGAGCTTACA AAATAATTTA TTGATTTGAT TGGCGTTGAA AATCTCCAGA 8340

TTTACCGCCA GACTTGCTTT CAAGGTAGGT TTCGCCAATA ATCATACCTT TATCAACTGC 8400

TTTCGTCATG TCGTAAATGG TTAAAGCCGT TGCTGATGCA GCGGTTAAAG CTTCCATTTC 8460

AACACCGGTT TTGCCAGTTG TAGAGACAGT TGTTTGAATG TTTAAAGTAT AAAGGGGTGC 8520

ATTTGTTTCA TCCCAGCTGA AGTGAACATC TATGCCAGTC AATGGTAATG GATGGCACAT 8580

CGGAATAAGT GTTGATGTAT TTTTGGCAGC CATAATACCA GCGATTTGAG CAGTGTTCAA 8640
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AATGCTTGAA TGAGCGACAG CAGTTCTTTT TGTAATTTGT TTGTCTGATA CATCGACCAT 8760 

TTTGGCGTGG CCTTGTTGAT TAATATGAGT AAACTCAGTC ATTTTACCCC TCCTAGTGCA 8820 

TCTAGTATAT CATGAAAAAA TAAAAGTTTT GGAGATGATT TTTAATGGTA GTAGAAAAAA 8880 

GAAACCCAAT CCCAGTTAAA GAAGCAATTC AACGTATCGT TAATCAGCAG AGTTCAATGC 8940 

CGGCAATTAC GGTAGCACTT GAAAAAAGTC TAAATCATAT CTTAGCAGAA GATATTGTAG 9000 

CTACTTATGA TATACCAAGG TTTGATAAAT CACCTTATGA TGGTTTTGCA ATTCGCAGTG 9060 

TTGATTCACA AGGGGCAAGT GGTCAGAATC GCATTGAGTT TAAAGTGATT GATCATATTG 9120

GTGCAGGTTC AGTTTCTGAT AAATTAGTTG GGGATCACGA AGCGGTGCGT ATTATGACTG 9180

GAGCACAAAT ACCTAATGGC GCAGATGCTG TTGTTATGTT TGAACAAACG ATTGAACTAG 9240

AAGATACATT TACAATTCGT AAACCATTTT CAAAAAATGA AAATATATCT TTAAAAGGTG 9300

AAGAAACAAA GACAGGCGAT GTTGTTCTAA AAAAAGGACA AGTAATTAAT CCAGGGGCTA 9360

TCGCGGTCCT TGCAACATAT GGCTATGCAG AGGTTAAAGT TATTAAGCAA CCGAGTGTCG 9420

CTGTTATTGC AACAGGAAGC GAATTATTAG ATGTTAATGA TGTATTAGAA GATGGGAAAA 9480

TTCGTAACTC TAATGGCCCA ATGATTCGTG CCTTAGCAGA AAAATTAGGT CTTGAAGTTG 9540

GTATTTACAA AACACAAAAA GATGATTTAG ATAGTGGCAT CCAAGTCGTT AAAGAAGCTA 9600

TGGAAAAACA TGATATCGTT ATTACAACGG GCGGAGTTTC TGTTGGAGAT TTTGACTATT 9660

TACCTGAGAT TTATAAGGCT GTAAAGGCGG AAGTGTTATT TAATAAAGTA GCAATGCGTC 9720

CTGGTAGCGT AACAACGGTT GCATTTGTAG ATGGaAAGTA TTTGTTTGGa TTATCTGGAA 9780

ATCCATCAGC TTGTTTTACA GGATTTGAAC TATTTGTGAA nCCAGCTGTT AAACATATGT 9840

GTGGCGCACT AGAAGTCTTC CCGCAAATAA TTAAAGCAAC ATTAATGGAA GATTTTACCA 9900

AGGGAAACCC ATTCACACGA TTTATACGTG CTAAAGCAAC GTTAACAAGT GCTGGAGCTA 9960

CTGTAGTACC TTCAGGATTC AATAAATCAG GTGCGGTTGT AGCGATTGCA CATGCTAACT 10020

GTATGGTCAT GTTACCAGGA GGGTCACGTG GTTTTAAAGC GGGGCATACA GTAGATATTA 10080

TATTGACTGA ATCTGACGCT GCTGAAGAGG AACTTCTTTT ATGATTTTAC AAATTGTAGG 10140

TTACAAAAAG TCTGGTAAGA CAACATTGAT GAGGCATATT GTCTCTTTCT TAAAGTCACA 10200

TGGTTATACA GTTGCTACTA TTAAAGATCA TGGGCATGGT AAGGAAGATA TTCAATTACA 10260

GGATTCAGAC GTCGATCACA TGAAGCATTT TGAAGCGGGG GCAGATCAAA GTATTGTACA 10320

AGGTTTTCAA TATCAGCAAA CTGTAACACG TGTAGATAAT CAAAATCTTA CTCAAATTAT 10380

TGAAAAATCT GTTACAATTG ACACCAATAT CGTATTAGTT GAAGGCTTTA AAAATGCTGA 10440
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GAATGTTTGT TATAGCATTA 
ATGTAAGGGA GCATGAAGAT TTTACAGCAT TTGAGCAATG 10560

GTTATTAAAT AAAATTAAAA ATGATTGTGA TACACAATTA ACATAGAGGA TTGAAATGAA 10620

TGAAACAATT TGAAATCGTG ACAGAACCGA TACAAACAGA ACAATATCGT GAATTCACTA 10680

TAAATGAATA TCAAGGTGCA GTAGTTGTTT TTACCGGTCA TGTTCGCGAA TGGACTAAAG 10740

GCGTCAAAAC GGAATATTTA GAATATGAAG CGTATATTCC AATGGCTGAA AAGAAATTGG 10800

CACAAATTGG AGATGAAATA AATGAAAAAT GGCCTGGAAC GATAACGAGT ATTGTTCATA 10860

GAATAGGGCC ATTACAAATT TCAGATATCG CTGTATTAAT TGCGGTTTCT TCACCGCATC 10920

GTAAAGATGC CTATCGAGCA AATGAATATG CAATTGAGCG TATAAAAGAA ATTGTTCCGA 10980

TTTGGAAAAA AGAAATTTGG GAAGATGGTT CAAAATGGCA AGGGCATCAA AAAGGGAATT 11040

ATGAAGAAGC AAAGAGGGAG GAATAAGAGA GATGAAGGTA CTTTACTTCG CAGAAATTAA 11100

AGATATATTA CAAAAAGCAC AGGAAGATAT TGTGCTTGAA CAAGCATTGA CTGTACAACA 11160

ATTTGAAGAT TTATTGTTTG AACGTTATCC GCAAATCAAT AATAAAAAGT TTCAAGTTGC 11220

TGTAAATGAG GAATTTGTAC AAAAATCGGA TTTCATTCAA CCTAATGATA CTGTTGCATT 11280

AATTCCACCG GTTAGTGGAG GTTAAGGGAG CATGAAAGCA ATAATTCTTG CAGGTGGTCA 11340

TTCAGTGCGA TTTGGTAAGC CCAAAGCTTT TGCGGAAGTG AACGGTGAGA CCTTTTATAG 11400

TAGAGTAATT AAGACATTAG AATCAACAAA TATGTTCAAT GAAATTATTA TTAGTACAAA 11460

TGCGCAATTG GCAACGCAAT TTAAATATCC AAATGTTGTT ATAGATGATG AGAATCATAA 11520

TGATAAAGGT CCATTAGCAG GAATTTATAC AATCATGAAG CAACATCCTG AAGAAGAATT 11580

GTTTTTTGTC GTTTCTGTTG ATACACCAAT GATTACTGGT AAAGCTGTAA GCACGTTGTA 11640

TCAGTTTTTA GTTTCTCATC TTATTGAAKA TCATTTAGAT GTCGCAGCTT TTAAAGAAGA 11700

TGGACGTTTT ATTCCAACAA TTGCATTTTA TAGTCCGAAT GCATTAGGCG CTATAACTAA 11760

AGCACTACAT TCTGATAATT ACAGTTTTAA AAATGTATAT CATGAATTAT CAACGGATTA 11820

TTTGGATGTA AGGGATGTAG ATGCGCCCTC ATATTGGTAC AAAAATATAA ATTATCAGCA 11880

TGATTTGGAC GCTTTAATTC AAAAATTGTA AGCTGTTAGG AGGTCCACAA ATGGTAGAAC 11940

AAATAAAAGA TAAACTAGGA CGTCCCATCC GTGACTTACG GTTATCTGTG ACAGATCGGT 12000

GTAACTTTAG GTGTGATTAT TGCATGCCTA AAGAGGTATT TGGAGATGAT TTCGTATTTT 12060

TACCTAAAAA TGAACTTTTA ACGTTTGATG AAATGGCTAG AATCGCTAAG GTATATGCAG 12120

AATTAGGTGT AAAAAAAATA CGCATTACAG GTGGAGAACC ATTGATGCGA CGGGATTTAG 12180

ATGTACTTAT AGCTAAATTA AATCAAATCG ATGGTATTGA AGATATTGGT TTGACTACAA 12240
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ATGTCAGTTT GGATGCTATT GATGATACGC TATTTCAATC AATCAATAAT CGTAATATTA 12 360 

AAGCGACTAC GATTTTAGAA CAAATTGATT ACGCGACGTC TATTGGTTTG AATGTAAAAG 12420 

TAAATGTTGT TATACAAAAA GGTATTAACG ATGATCAAAT CATACCAATG CTTGAATATT 12480

TTAAAGATAA ACATATAGAG ATTCGATTTA TAGAATTTAT GGATGTTGGT AATGATAATG 12540

GATGGGATTT CAGTAAAGTT GTAACTAAAG ATGAAATGCT TACAATGATA GAGCAGCACT 12600

TTGAAATCGA TCCTGTAGAA CCAAAATATT TTGGGGAAGT AGCAAAATAT TATCGCCATA 12660

AGGATAATGG TGTTCAATTT GGTTTGATTA CAAGTGTTTC ACAATCATTT TGTTCTACAT 12720

GTACACGCGC AAGGCTGTCA TCAGATGGGA AGTTTTACGG ATGTTTATTT GCAACTGTCG 12780

ATGGATTTAA CGTTAAAGCG TTTATTCGTT CTGGCGTGAC CGACGAAGAA TTAAAAGAAC 12840

AATTTAAAGC TTTATGGCAA ATAAGAGATG ATCGATATTC AGATGAGAGA ACTGCTCAAA 12900

CAGTTGCCAA TCGTCAACGT AAAAAGATAA ACATGAATTA TATTGGTGGT TAATGTGTAG 12960

GGACCACTAC ATATTAAATC ATTAGAGATG TTTTAATATT TCTGTCTTAC TCCCTAAAAT 13020

ACAATATTAT TTATTAAAGT AAAAACGGTC ATATCTATGC CAGATTTAAT AGAAATGATC 13080

GTTTTTAAAG TTTTTACAAG TTGGCGGGGC CCCAACACAG AAGCTGACAG AAAGTCAGCT 13140

TACAATAATG TGCAAGTTGG CGGGGCCCCA ACATAGAGAA TTTCAAAAAG AAATTCTACA 13200

GACAATGCAA GTTGGGGAAC GGGGCCCCAA CACAGAAGGT GACGAAAAGT CAGCATACAA 13260

TAATGTGCAA GTTGGCGGGG CCCCAACATA GAGAATTTCA AAAGAAATTC TACAGACAAT 13320

GCAAGTTGGG GATCAACGAA ATAAATTTTA TGAGAATATC ATTTCTATCC CACTCTTAAG 13380

AATCACTACA TAATAAATCT TTAGTGGTTC TTTAACATTG ATGTCACACT CCATGCCATT 13440

GAGTTGTAAT ATATCTTTTT TAGGTATAAA TGTTGTCGAA TAAACAACAA GTTGTCCAAA 13500

AGATATAAAT CTAAACAAGA TATAGCCAGC AATTTAATAT TTGTAATAGA TAAAATGCTA 13560

AGTTTGATAT ATAATAAATT TAAGTAATTG TATAATAATA TGAATTACAA ACATCTAAGA 13620

AGAAACATAG GAGGCATCAT ATTATGAGTA ATAAAGTTCA ACGTTTTATA GAAGCAGAAA 13680

GGGAGTTAAG TCAGTTAAAG CACTGGTTAA AAACAACACA TAAGATTTCA ATTGAAGAAT 13740

TTGTAGTCCT TTTTAAAGTG TATGAAGCTG AAAAGATTAG CGGTAAAGAA TTGAGGGATm 13800

CATTACATTT TGAAATGCTA TGGGATACAA GTAAAATCGA TGTGATTATC CGTAAAaTCT 13860

ATAAAAAAGA GCTTATTTCT AAATTGCGTT CTGAAACGGA TGAAAGACAA GTATTCTATT 13920

TCTATAGTAC TTCTCAAAAG AAATTGTTAG ATAAAATTAC TAAAGAAATA GAAGTGTTAA 13980

GCGTTACAAA CTAAAAACTT aAAAAgcaTG CCAATCTCTA TTCATCATAA TTGCGTCTTG 14040
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GTTCATGGCA 


TTTCTAGTTA CATGACGTCC ATGAATTAAG AAGTAAACAA GCATAGTAAT 14160

GATTGCTAAA GCGGCCATAA AGCCGAAGAT TTCACTATAT GAAAACATAT GAGTAAATAA 14220

CCCAAGGAAT GATGGACCGA AGCCGACACC TGCATCTAGA CCAACGTAAA AAGTAGATGT 14280

CGCGATACCA TATTTAATCG GGGGTGAGAC TTTTATCGCA ATAGATTGCA TTGCAGATGA 14340

TAAATTTCCA TACCCTAAAC CTAGGCAAGC ACCAGCAAGT AATATTAACC AGCTTTGATA 14400

GCTTGAAATT AAGCATACAA ATGAAAGGAA AAGCATGATA AATGCTGGGT AGACAATAAT 14460

ATTTTCATTT TTATCATCCA TCAATCTACC AGCAATAGGT CTAGTAATTA ACGATGCTAT 14520

AGCATAGCAA ATAAAGAAAT AGCTTGCTGC AGTGACTAGG TGTCGCTCTA AAGCAAATGC 14580

TTGTAAATAA GTTAGGATGG ACGCATAGGT AACGCCAATT AAAAGCATAA TTACAGCAAC 14640

AGGAATGGCC TCTTTTGCAA TAAATTGATG AATACTAAAT CTTGGTTTAT CAATGACATT 14700

AGTTTCAGTT TTGTTATTTG TTACTTCGAA ATCAACTTTT ATAAATAATG AGATAATGAG 14760

TCCGAGTATG CCTAATATGA CACAAATAAT AAACAGTAAG TCAATTGCGT ATTTTGTAAT 14820

AAGTAACATG CCTAGAAATG GGCCAATCGC TGTACCTAAT ACTAAACTTA AGGAAAATAA 14880

ACTGATGCCT TCACTTTTTC TATTAACAGG GGTAACGTAT GCCGCAATAG TACCTGTTGC 14940

AGTTGTCACA ACTGCAGTTG CGATACCGTT TATGAGACGT ACAAAGATTA AAAAAGCTAA 15000

AGATCCATCA ATAAAATAAA GTAATTGCGT GATAATTAAA GCAATTAAAC CAATAAATAA 15060

TAATCGTTTA GGTCCrATTT SATTTACAAA TTTACCTGTA GCAAATCGA 15109



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9072 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GAGAGTCAAT GGCAAGAAGA ATATAAATAT TTGAGAGCGT TAATCTTTAA TGAAACAGAA 60 

TTAGAGGAAG CGTATAAATG GATGCATCCT TGTTACACGT TGAATAATAA AAATGTAGTA 120 

CTTATCCATG GCTTCAAAAA TTATGTTGCA CTATTATTTC ATAAAGGTGC CATTTTGGAG 180 

GATAAATATC ATACACTCAT TCAACAGACT GAAAAGGTGC AAGCAGCTCG TCAGTTACGA 240

TTTGAAAATT TAACAGAGAT TCAAGCACGT ACCGAAGAAA TTAAATATTA TCTAGCCGAA 300 

GCAATTAAAG CTGAAAAAGC TGGTAAAAAA GTTGAAATGA AGAAAACAGA GGAATATGTT 360 
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AAATTAACGC CAGGCAGACA ACATCAATAT ATATATCATA TTGGACAAGC TAAACGCAgT 4 BO 

GgAACAAGAC AAAAGCGTGT TGAAAAGTAT ATTAACCAAA TACTAGAAGG TAAAGGGATG 540

CATGATAAGT AATTAATGAG TAAAGCATAC CGGTTATACA ACAACATACA AGATGACACG 600

AAACAACCAA TGGCTCATGC TGTTGGTTGT TTTTTTAGGT GTGTCTGTCA TGGGCAACAC 660

TTTGACGTTG GAATTCCGTT ACAGGCTTGG GAGTAGAAAA TGTTAGCAAA AGGCAAGGGT 720

GTCTACAATG AATGATGAAG ATATTAAAAT ATAAGGATGA CTTTGTGAGT GGCGGATGGG 780

CGGTTGTCCG TCTGTAACAA TGGATGCGTG TGCATTATTA CAAAAATTCG ACTTTTGTAA 840

TAATATTTCA CATTTTCGAC ACTTTTTTGC TATAAAACAA CCAATTGAGC GATAATAAAT 900

TCGCTTTTAA AAAATATGAG TTATCTATTT AGTTGCCAAA GATAAAATAA TAATGTTTAA 960

TAACATCATA TAGAGTATGT TAGTTTTAAA TGTCGAATAT ACGAATGTGc AAACAAAGTA 1020

ATCGGTAGAA ATTCAACATA CATAGCGCCG TTTACTGTTA AGTATTCACA TTACAGATGA 1080

AAAATATAAA ATTCTACATA ATCAAGACCA TGATGTGTAC TTGTTTAACT TATGACTCTA 1140

TTTGTTTAAC AATTGCGATA ATGGTCTTTT TATTTTATGC GTATCATTCG TCATATTTTT 1200

TATGAGGAAG GAGAAATGAT TATGTTAAGT ATTAAGCATT TAACGAAAAT TTATTCTGGT 1260

AATAAAAAGG CAGTAGATGA CATCTCTTTA GATATTCAAT CTGGGGAATT TATCGCATTT 1320

ATTGGAACCA GTGGAAGTGG CAAAACGACT GCTTTAAGAA TGATAAACCG TATGATTGAA 1380

GCGACAGAAG GACAAATTGA AATTGATGGT AAAGATGTTC GGAGTATGAA TCCTGTCGAA 1440

TTGCGTAGAA ATATTGGCTA TGTTATTCAA CAAATTGGCT TAATGCCTCA TATGACGATT 1500

AAAGAGAATA TTGTGTTGGT ACCCAAATTG TTGAAATGGA CTAAAGAGGA AAAGGATAAA 1560

CGTGCAAAGG AATTAATTAA ACTTGTGGAT TTACCGGAGT CATTTTTAGA GCGTTATCCA 1620

GCAGAACTAT CAGGTGGGCA ACAACAACGT ATCGGTGTTG TAAGAGCACT TGCGGCCGAA 1680

CAAGATATTA TTTTAATGGA TGAACCTTTT GGTGCATTGG ATCCTATTAC GAGAGATACG 1740

TTACAAGATT TAGTTAAAAC GTTACAACGA AAATTAGGCA AGACGTTTAT CTTTGTAACA 1800

CATGATATGG ATGAAGCGAT TAAATTAGCA GACAAAATTT GTATTATGTC AGAAGGTAAG 1860

GTGGTGCAAT TTGATACGCC AGACAATATT TTAAGACATC CCGCAAATGA TTTTGTACGT 1920

GATTTTATAG GACAAAATAG ACTGATTCAA GACCGTCCCA ATGACAAGAC TGTAGAAGGT 1980

GTAATGATTA AACCAATCAC GATACAAGCA GAAGCAACAC TGAATGACGC CGTTCATATT 2040

ATGAGACAAA AACGTGTTGA TACTATTTTT GTAGTAGATA GTAATAACCA TTTACTAGGT 2100

TTCTTAGACA TTGAAGATAT AAATCAGGGT ATACGTGGAC ACAAAAGTTT ACGAGACACC 2160
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ATTTTAAAAA 


GAAACGTTAG GAATGTACCT GTCGTAGATG ATCAACAGCG TTTAGTAGGA 2280

CTGATTACGC GTGCCAATGT TGTTGATATT GTATATGACA CGATTTGGGG CGATAGTGAG 2340

GATACAGTGC AAACAGAACA TGTGGGGGAA GACAcTGCGT CCTCAAAAGT GCATGAGCAA 2400

CACACTACTA ATGTCAAAGT ACGTGACATA GGAGATGATA AATCATGATT GAGTTCCTAC 2460

ATGAACATGG TGGACAGTTG ATGTCGAAAA CACTGGAACA TTTCTATATT TCTATAGTGG 2520

CATTATTACT TGCCATCATT GTTGCAGTAC CTATAGGCAT TTTATTATCA AAAACAAAGC 2580

GAACTGCCAA TATTGTATTA ACTGTGGCAG GTGTCTTACA AACTATTCCA ACACTAGCTG 2640

TACTTGCTAT TATGATACCG ATTTTTGGTG TTGGTAAAAC GCCTGCAATT GTAGCGCTAT 2700

TTATTTATGT ATTATTACCT ATTTTAAATA ACACGGTACT CGGTGTTCAA AATATTGATA 2760

GCAACATTAA AGAAGCTGGA AAAAGTATGG GAATGACACA ATTTCAATTG ATGAAGGATG 2820

TTGAATTGCC GTTAGCATTG CCGCTTATCA TTGGTGGCAT TCGTTTGTCA TCTGTGTATG 2880

TAATTAGTTG GGCTACACTT GCAAGTTATG TAGGTGCGGG TGGATTAGGT GATTTCATTT 2940

TCAATGGTTT AAATTTATAT GATCCACTGA TGATTGTAAC TGCAACGGTA CTCGTTACTG 3000

CACTAGCATT AGGTGTTGAT GCCTTATTAG CTTTAGTTGA AAAATGGGTA GTTCCCAAAG 3060

GCTTAAAAGT ATCTGGATAA TTAGGAGGCT AAGATAATGA AGAAAATTAA ATATATACTT 3120

GTCGTGTTTG TCTTATCGCT TACCGTATTA TCTGGATGTA GTTTGCCCGG ACTAGGTAGT 3180

AAGAGCACGA AAAATGATGT CAAAATTACA GCATTATCAA CAAGCGAATC GCAAATTATT 3240

TCACATATGT TACGGTTGTT AATAGAGCAT GATACACACG GTAAGATAAA GCCAACATTA 3300

GTAAATAATT TAGGGTCAAG TACGATTCAA CATAATGCCT TAATTAATGG GGATGCTAAT 3360

ATATCAGGTG TTAGATATAA TGGCACAGAT TTAACGGGAG CTTTGAAGGA AGCACCAATT 3420

AAAAATCCTA AGAAAGCAAT GATAGCAACA CAACAAGGAT TTAAAAAGAA ATTTGATCAA 3480

ACGTTTTTTG ATTCGTATGG TTTTGCGAAT ACGTATGCAT TCATGGTAAC GAAGGAAACC 3540

GCTAAAAAAT ATCATTTAGA GACAGTTTCA GATTTAGCAA AGCATAGTAA AGATTTACGT 3600

TTAGGTATGG ATAGTTCATG GATGAATCGT AAAGGCGATG GCTATGAAGG ATTTAAAAAA 3660

GAGTATGGTT TTGACTTTGG TACAGTGAGA CCAATGCAAA TAGGTCTAGT CTACGACGCA 3720

TTAAACTCAG AGAAGTTAGA CGTTGCATTA GGTTATTCTA CAGATGGTCG AATTGCGGCG 3780

TATGATTTGA AAGTACTTAA AGATGATAAA CAATTTTTCC CACCTTATGC TGCGAGTGCT 3840

GTTGCAACAA ATGAATTATT ACGGCAACAC CCAGAACTTA AAACGACGAT TAATAAGTTG 3900

ACAGGAAAGA TTTCGACTTC AGAGATGCAA CGCTTGAATT ATGAAGCGGA TGGTAAAGGT 3960
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AAAGGTGGTC 


ATAAGTAATG GAAGGTAATT TATTACAGCA ATTATTCAAT TATTATGTTA 4080

CGAACTTTGG TTATCTATGG GATTTATTTT TCAAACACTT ATTAATGTCT GTCTATGGTG 4140

TGCTGTTTGC AgCTTTAATT GGTATTCCAT TGGGAATCTT GCTTGCaAGA TACACAAAAC 4200

TTTCTGGATT TGTAATTACA ATTGCAAATA TAATTCAAAC AGTTCCAGTC ATTGCAATGT 4260

TAGCTATTTT AATGTTAGTC ATGGGCTTAG GTTCAGAAAC AGTAGTTTTA ACAGTGTTTT 4320

TATATGCGTT ACTTCCAATT ATAAAAAACA CTTATACTGG TATAGCTAGT GTTGATGCGA 4380

ATATTAAGGA TGCTGGCAAA GGTATGGGAA TGACACGCAA TCAAGTGCTA CGAATGATTG 4440

AATTACCGTT ATCTGTTTCG GTTATTATCG GTGGCATTCG TATTGCCTTG GTTGTTGCGA 4500

TAGGTGTTGT TGCCGTTGGA TCATTTATAG GAGCACCTAC GCTTGGTGAC ATTGTGATTC 4560

GTGGTACAAA TGCGACGGAT GGCACAACGT TTATTTTAGC AGGTGCGATT CCGATTGCTA 4620

TCATTGCAAT CGTCATTGAT GTACTATTAA GATTTTTAGA AAAACGATTA GACCCAACAA 4680

CACGACATCG TAAAAATCAA TCTAATCATC GGCCGCAAAG TATTAATATG TAATAGTAGA 4740

AGATGTTTAT AATTTAGCGA TTTCGTTTCA TGATTTATAA AAAATGAGGC TACTCAAGGA 4800

GCTCAAATAA TCTTTGAGTA GCCTTTTTAT AGGTTGTGTT TGTATGCGTT TACACTAAAA 4860

TAGCAATTAT TATCATGAAA GTTTTTGGAT AAAAAGGGTT AATTATTGTA AAAATACTAA 4920

AAAATGAGAT GTTTTATTTA TAATTTTCTG CAAATTTATG ATATTGTTTC TTAATATATC 4980

ATATTAAAAA TTTGTTTTTC TTAAACATAG GAGGCTTATC TAATTCATGG ACACATCAAA 5040

ACAATTTAGA GGTGACAACC GATTGCTTTT GGGTATCGTT TTAGGGGTTA TTACCTTTTG 5100

GCTATTCGCG CAGTCACTTG TTAATCTTGT TGTCCCATTA CAATCAACAT ATAGTAGTGA 5160

CGTTGGAACG ATAAATATCG CTGTTAGCTT ATCTGCCTTA TTTGCTGGTT TGTTTATCGT 5220

AGGTGCTGGT GATGTTGCTG ATAAATTTGG TCGCGTCAAA ATTACTTATG TAGGATTGAT 5280

ATTAAATGTT GTAGGTTCAT TACTCATCAT CATTACACCT TTGCCAGCAT TTTTAATTAT 5340

AGGTAGAATA ATTCAAGGTT TGTCTGCAGC ATGTATTATG CCATCAACAC TTGCTATTAT 5400

TAACGAATAT TATATTGGTA CAAGAAGACA ACGTGCCTTA AGCTATTGGT CTATTGGTTC 5460

TTGGGGTGGT AGTGGTATTT GTACGTTGTT TGGTGGCTTA ATGGCTACAT ATATAGGTTG 5520

GCGTTCAATA TTTGTTGTTT CAATTCTATT AACATTATTA GCAATGTACT TAATCAAACA 5580

TGCACCTGAG ACTAAAGCAG AACCAATCAA AGGTATGAAA GCAGAAGCTA AAAAGTTTGA 5640

CGTTATTGGT TTAGTCATTT TAGTAGTGAC GATGTTAAGT TTAAATGTAA TCATCACACA 5700

GACGTCTCAT TTTGGTTTAG TTTCACCGTT AATTCTAGGT TTAATTGTTG TGTTTATCTG 5760






EP 0 786 519 A2 





AATTTTTAAA AATAGAGGAT ACAGTGGTGC AACTATTTCA AACTTCTTAT TAAATGGTGT 5880

AGCAGGTGGT GCACTTATCG TTATTAACAC GTATTATCAA CAACAATTAG GATTTAATTC 5940

TTCGCAAACG GGTTATATTT CATTAACGTA TTTAATAACA GTGTTGTCAA TGATTCGTGT 6000

AGGTGAAAAG ATTTTATCTC AACATGGTCC GAAGCGCCCA CTATTACTAG GAAGTGGCTT 6060

TACAGTGATT GGGTTAATCT TATTGTCGTT AACATTTTTA CCAGAAGTGT GGTATATCAT 6120

ATCTAGTATA GTTGGATATT TATTGTTTGG TACTGGTTTA GGATTATATG CTACACCATC 6180

AACTGATACA GCAGTTGCTA GTGCGCCAGA TGATAAGTCG GGTGTTGCTT CAGGTGTGTA 6240

TAAAATGGCG TCATCATTAG GAAATGCATT TGGAGTAGCA GTATCTGGTA CGGTTTATAC 6300

TGTGITAGCA GCTAATTTAA ATTTGAACTT AGGTGGTTTC ACAGGTATGA TGTTTAATGC 6360

CTTGCTAGCA ATTGTTGCAT TTTTAGTCAT TTTACTATTA GTTCCTAAAA ATCAAACGAA 6420

TTTGTAAAAC TGAAATGAAA GCAAGTTATT ATGTAGGGAT TTTAAAGGAA ATTTTGTGAA 6480

AGTAAGTTTA TCATACACAC TTAATGTTGC GTATTGACGT TTAATGTTAG GTGTGTTCTT 6540

TTATAGACGA TAAAAGCTGT GTGCATATTA AGCGAATGAT TTTCAAATTG ACGCTAATAT 6600

GCGAAAGTAG TATTTTTAAA ATGAACAACA ACGATGAAGA GGGGTTTATA GGATGAAAAT 6660

TGCAATTGCT GGATCGGGTG CATTAGGTAG TGGCTTTGGT GCCAAACTAT TTCAAGCAGG 6720

ATATGATGTC ACACTTATTG ACGGATATAC ATCTCATGTT GAAGCGGTTA AGCAACATGG 6780

ATTAAATATA ACGATTAATG GAGAGGCATT CGAGTTAAAC ATTCCGATGT ATCATTTTAA 6840

TGATCAACCG GACGAAAGCA TTTACGATGT TGTCTTTCTA TTTCCAAAGT CTATGCAATT 6900

AAAAGAAGTG ATGGAAGATA TGAAGCCACA TATTGATAAT GAAACGATCG TCGTATGTAC 6960

GATGAATGGT GTGAAGGATG AAGAAGTGAT TGCGCAGTAT GTTGCTCAAT CACAAATTGT 7020

CAGAGGTGTT ACGACTTGGA CGGCAGGTCT TGAAAGCCCT GGACACAGTC ATTTACTTGG 7080

TAGTCGACCA GTTGAAATAG GTGAACTAGT GGATGAAGGT AAAGAAAATG TTATAAAAGT 7140

TGCTGATTTA CTTAACGAAG CGGAATTGAA TGGTGTCATT AGTAAAGATT TATACCAATC 7200

GATTTGGAAA AAGATTTGTG TTAATGGTAC GGCAAATGCA TTAAGCACAG TGTTGGAGTG 7260

TAATATGGCA TCGCTGAATG AAAGTAGTTA TGCGAAGTGT TTGATTTATA AATTAACGCA 7320

AGAAATAGTG CATGTAGCGA CGATTGATAA TGTTCATTTA AATGTTGATG AAGTATTTGA 7380

ATATTTAGTT GATTTAAATG AAaAAGTTGG TGCGCATTAT CCATCCATGT ATCAAGATTT 7440

AATTGTTAAT AATAGAAAAA CTGAAATTGA TTATATTAAT GGCGCAGTTG CAACATTAGG 7500

TAAACAACGT CaTATTGAAG CGCCAGTCAA TCGCTTTATT ACTGATTTAA TTCATACTAA 7560
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CAATCACGTG 


ATATTACGGT 


CATTATTAAG 


ATTGAAATGT 


AATAAATAAA 


GAACAGCAGT 


7680 


AAGGTACTTT CAAATTGAAA TGATCTTGGT GCTGTTTTTC TTGATTGATC TTCGTCATAA 7740
TTCAGATTTG TCATAGGcTA CGACATACTA TTAGTATTTA CTAGACAGTT TTTACGACGA 7800
CACTTTGAAA AATTTTGAGG CAAATCATTT GGAAGTCTCA CGTGAATTTT GTAAACTCAT 7860
CAAGCAAGTA ATTATATTAA AAAGACAAAT AGAGAAAAGG TGTTTATAAT GAGTAAAATT 7920
TTTGTAACTG GTGCAACGGG CCTTATTGGC ATTAAATTAG TTCAAAGACT AAAAGAAGAG 7980
GGGCATGAGG TTGCTGGTTT TACTACATCT GAGAATGGTC AACAAAAGCT AGCTGCTGTT 8040
AATGTAAAAG CATATATTGG TGATATATTA AAAGCTGATA CTATTGATCA AGCGTTAGCA 8100
GATTTTAAAC CAGAAATCAT TATCAATCAA ATTACGGATT TAAAAAATGT TGATATGGCA 8160
GCAAATACGA AAGTACGTAT TGAAGGTTCT AAAAACCTAA TTGATGCGGC GAAAAAGCAT 8220
GACGTTAAGA AAGTAATTGC CCAAAGTATT GCCTTTATGT ATGAACCTGG CGAAGGATTA 8280
GCAAATGAGG AAACTTCACT TGATTTTAAC TCAACTGGCG ATAGAAAAGT AACGGTTGAT 8340


GGTGTGGTTG GTTTAGAAGA AGAAACGGCT CGTATGGATG AATACGTTGT TTTACGTTTT 8400
GGCTGGTTAT ATGGCCCAGG TACTTGGTAC GGAAAAGATG GCATGATTTA TAATCAATTT 8460
ATGGATGGTC AAGTGACACT TTCAGATGGC GTAACATCAT TTGTGCATCT TGATGATGCA 8520
GTTGAAACAT CTATTCAAGC TATTCATTTT GAAAATGGTA TCTATAATGT AGCAGATGAT 8580
GCACCTGTTA AAGGTTCTGA ATTTGCAGAA TGGTATAAAG AACAACTTGG TGTTGAACCA 8640
AATATTGATA TTCAACCTGC GCAACCATTT GAACGTGGCG TAAGCAATGA GAAGTTTAAA 8700
GCGCAAGGTG GTACTCTGAT TTATCAAACT TGGAAAGATG GCATGAATCC AATTAAATAA 8760
TAATTTATCC GTTTAATATA CAAAGAATAA AGACTTGGTC GAATCGTGGA TGATATATTA 8820
TCAnACvj vJAC AGTCTTTTTT ATTATGTCTT CGTTATCTTT GTATGAAGGA 8880
ATAACAGAAT TACAATTAAT GTACTGAATA ATGCAATTAA TGTTGTGATT AGTGCTAATT 8940
TAATTTCTAT TGGTAGCCAA GTCAGTACAA AAGACCAATT ATTGCTACCG AGAATGAGAT 9000
ATGGTAATGC ATATAATATG AGCGCTAAAG CGATACATAT ACATAATGAT AACCAACTCA 9060
ATACAGCAAT CC 9072

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16826 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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300 
360 
420 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
GTGGAACAGC TGTAACTATA TCATTTCTTT CAACATTTAT TGGGAAAATG TTAGCTACAT 
TTCTATATCC GATTAATAAT GTAGTACTTT CATATATnTC TCTAAATGAA AGTGACAATA 
TAAAGAAGCA ATATTTGaAA ACTAATCTAA TTGCTATAGC TGCCCTATGT TTAGTCATGA 
TTATATGTTA TCCAATTACA ATAATTATTG TCTCTTTACT GTATAACATT GATTCAAGTT 
TATATTCGAA GTTTATTATT TTAGGTAATA TAGGTGTTTT ATTCAATGCA GTGAGTATTA 
TGATCCAAAC TTTAAATACA AAACACGCAT CAATAACATT ACAAGCGAAT TATATGACGC
TTCACACGAT TACATTTATA TTCATAACTA TTTTAATGAC AATTGCGTTT GGTCTAAATG 

GATTCTTTTG GACAACGCTG TTCAGCAACA TTATTAAGTA TGTGATTTTA AATATTATAG 480

GTTTAAAGTC TAAATTCATT AATAAAAAGG ACGTCGATTA GATGAGTGAA AAAAAGATTT 540

TGATTTTATG TCAGTATTTT TATCCGGAAT ATGTATCTTC TGCGACGTTA CCAACTCAAT 600

TGGCGGAAGA TTTAATTGCG AATCACATTA ATGTCGATGT CATGTGTGGA TGGCCATATG 660
AATATAGTAA TCATAAACAG GTTTCTAAAA CCGAGATGCA TCGTGGTATT CGCATTCGAC
GTCTCAAGTA TTCGAGGTTT AATAACAAAA GTAAGGTTGG AAGGATCATC AATTTCTTTA

GTTTATTTTC AAAATTCGTG ATTAATATAC CTAAAATGTT GAAATATGAT CAGATTCTTG 840 

TTTACTCTAA TCCACCAATC TTGCCATTAA TACCAGACGT TTTACACAGA CTGCTTAAGA 900 

AAAAATATTC TTTTGTGGTG TATGATATAG CACCTGATAA TGCGATTAAG ACAGGTGCAA 960

CTCGTCCAGG TAGCATGATT GATAAGCTGA TGCGTTACAT TAATAGACAT GTCTACAAGA 1020 

ATGCTGAAAA TGTCATTGTC CTTGGTACGG AAATGAAAAA CTACTTACTA AATCATCAAA 1080 

TTTCTAAAAA— TGCrGACAAT~ATCCATGTGA~TT^ ^— — 

AAGACAATCG TATCTATAAT GACACATTTA AAGCTTACCG TGAGCAATAC GACAAAATTT 1200

TATTGTATAG CGGTAATATG GGGCAGTTAC AGGATATGGA GACACTTATC TCATTTTTAA 1260 

AATTAAATAA GGATCAGTCT CAAACGTTAA CAATACTTTG TGGTCATGGT AAGAAATTTG 1320 

CAGATGTCAA AACGGCAATA GaAGACCATC GTATTGAAAA TGTTAAAATG TTTGAGTTTT 1380 

TAACAGGTAC AGACTATGCT GACGTATTAA AAATTGCGGA TGTATGTATT GCATCGCTGA 1440

TTAAAGAAGG CGTCGGTTTA GGCGTGCCGA GCAAGAATTA TGGCTATCTT GCAGCTAAGA 1500

AAGCGTTGGT ACTCATCATG GATAAGCAAT CTGATATCGT TCAACATGTT GAACAATATG 1560

ATGCGGGTAT CCAAATTGAT AATGGCGATG CACATGCCAT TTATAACTTC ATCAACACTC 1620

ACTCGAGTAA GGAATTGCAC GAGATGGGTG AGCGCGCACA TCAACTGTTT AAAGATAAAT 1680



720 
780 
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AAGCGATTAT 


TCGATGTAGT 


GAGTTCAATA 


TATGGTTTAG 


TAGTTTTAAG 


TCCGATTCTG 


1800 




TTAATTACAG CATTACTAAT TAAAATGGAa TCACCTGGAC CAGCCATTTT CAAACAAAAA 1860
AGACCGACGA TTAATAATGA ATTGTTTAAT ATTTATAAGT TTAGATCAAT GAAAATAGAC 1920
ACACCTAATG TTGCAACTGA TTTAATGGAT TCAACATCGT ATATAACAAA GACAGGGAAG 1980
GTCATTCGTA AGACCTCTAT TGATGAATTG CCACAATTAT TGAATGTTTT AAAAGGAGAA 2040
ATGTCAATTG TAGGTCCTAG ACCAGCGCTT TATAATCAAT ACGAATTAAT CGAAAAACGT 2100
ACAAAAGCGA ACGTGCATAC GATTAGACCA GGTGTGACAG GACTAGCTCA AGTGATGGGG 2160
AGAGATGATA TCACTGATGA TCAAAAAGTA GCGTATGATC ATTATTACTT AACACATCAA 2220
TCTATGATGC TTGATATGTA TATCATATAT AAAACAATTA AAAATATCGT TACTTCAGAA 2280
GGTGTGCATC ACTAATGAGA AAAAATATTT TAATTACAGG CGTACATGGA TATATCGGTA 2340
ATGCTTTAAA AGATAAGCTT ATTGAACAAG GACATCAAGT AGATCAAATT AATGTTAGGA 2400
ATCAATTATG GAAGTCGACC TCGTTCAAAG ATTATGATGT TTTAATTCAT ACAGCAGCTT 2460
TGGTTCACAA CAATTCACCT CAAGCAAGGC TATCTGATTA TATGCAAGTG AATATGTTGC 2520
TGACGAAACA ATTGGCACAA AAGGCTAAAG CTGAAGACGT TAAACAATTT ATTTTTATGA 2580
GTACTATGGC AGTTTATGGA AAAGAAGGTC ATGTTGGTAA ATCAGATCAA GTTGATACAC 2640




AAACACCAAT GAACCCTACG ACCAACTATG GTATTTCCAA AAAGTTCGCT GAACAAGCAT 2700
TACAAGAATT GATTAGTGAT TCGTTTAAAG TAGCAATTGT GAGACCACCA ATGATTTATG 2760
GTGCACATTG CCCAGGAAAT TTCCAACGGT TAATGCAATT GTCAAAGCGA TTGCCAATCA 2820
TTCCCAATAT TAACAATCAG CGCAGTGCAT TATATATTAA ACATCTGACA GCATTTATTG 2880
ATCAATTAAT ATCATTAGAA GTGACAGGTG TGTACCATCC TCAAGATAGT TTTTACTTTG 2940
ATACATCGTC AGTAATGTAT GAAATACGTC GCCAATCACA TCGTAAAACG GTATTGATCA 3000
ACATGCCTTC AATGCTAAAT AAGTATTTTA ATAAGTTGTC GGTCTTTAGA AAATTATTCG 3060
GCAATTTAAT ATACAGCAAT ACGTTATATG AAAATAATAA TGCACTTGAA ATTATTCCTG 3120
GAAAAATGTC ACTTGTTATT GCGGACATCA TGGATGAAAC GACAACCAAA GATAAGGCAT 3180
AAGTCATCTA TTAAATAAAA TCAACATACA AATCGTTTTA TTTGGAGGTT ATAGTATGAA 3240
GTTAACAGTA GTTGGCTTAG GTTATATTGG TTTACCAACA TCAATTATGT TTGCAAAACA 3300
TGGcGTCGAT GTGCTTGGTG TTGATATTAA TCAGCAAACG ATTGATAAGT TACAAAGTGG 3360
TCAAATTAGT ATTGAAGAAC CTGGATTACA AGAGGTTTAT GAAGAGGTAC TGTCATCGGG 3420
AAAATTGAAG GTATCTACAA CGCCAGATGC ATCTGATGTT TTTATCATTG CCGTTCCGAC 3480
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TAGTATTTTA TCATTTTTAG AAAAAGGAAA TACCATTATT GTAGAGTCGA CAATTGCGCC 36o0 

TAAAACGATG GATGATTTTG TAAAACCAGT CATTGAAAAT TTAGGGTTTA CAATAGGTGA 3660 

AGATATTTAT TTAGTGCATT GTCCAGAACG TGTACTGCCA GGAAAAATTT TAGAAGAATT 3720 

AGTTCATAAC AATCGTATCA TTGGCGGTGT GACTGAAGCT TGTATTGAAG CGGGTAAACG 3780 

TGTCTATCGC ACATTCGTTC AGGGAGAAAT GATTGAAACA GATGCACGTA CTGCTGAAAT 3840 

GAGTAAGCTA ATGGAAAACA CATATAGAGA CGTGAACATT GCTTTAGCTA ATGAATTAAC 3900

AAAAATTTGC AATAACTTAA ATATTAATGT ATTAGATGTG ATTGAAATGG CAAACAAACA 3960

TCCGCGTGTT AACATCCATC AGCCTGGTCC AGGTGTAGGC GGTCATTGTT TAGCTGTTGA 4020 

TCCGTACTTT ATTATTGCTA AAGACCCTGA AAATGCAAAG TTAATTCAAA CTGGACGTGA 4080

AATTAATAAT TCAATGCCGG CCTATGTTGT TGATACAACG AAGCAAATCA TCAAAGTGTT 4140

GAGCGGGAAT AAAGTCACAG TATTTGGTTT AACTTATAAA GGTGATGTTG ATGATATAAG 4200

AGAATCACCA GCATTTGATA TTTATGAGCT ATTAAATCAA GAACCAGACA TAGAAGTATG 4260 

TGCTTATGAT CCACATGTTG AATTAGATTT TGTGGAACAT GATATGTCAC ATGCTGTCAA 4320

AGACGCATCG CTAGTATTGA TTTTAAGTGA CCACTCAGAA TTTAAAAATT TATCGGACAG 4380

TCATTTTGAT AAAATGAAGC ATAAAGTGAT TTTTGATACA AAAAATGTTG TGAAATCATC 4440

ATTTGAAGAT GTATCGTATT ATAATTATGG CAATATATTT AATTTTATCG ACAAATAAAA 4500

TGTGTCAAAC TAGGGCATAC ATGATTAAGG AAAGATAAGC TGTCATGTGT TTGAACTTCA 4560

GAGAGGATAA TGTTATGAAA AAAATTATGG TTATTTTCGG TACGAGACCC GAAGCAATAA 4620 

AAATGGCACC ATTAGTAAAA GAAATTGATC ATAATGGGAA CTTTGAAGCG AACATTGTGA 4680

TTACAGCACA ACATAGAGAT ATGTTAGATA GTGTGTTAAG TATATTTGAT ATTCAAGCTG 4740

ATCMX5ATTT AAATATTATG CAAGATCAAC AAACATTAGC AGGCCTTACG GCGAATGCAC 4800

TTGCTAAACT TGATAGCATC ATTAATGAGG AACAACCGGA TATGATTTTA GTACATGGTG 4860 

ATACTACAAC GACTTTTGTA GGAAGTTTGG CAGCATTTTA TCATCAAATT CCGGTCGGAC 4920

ATGTAGAAGC TGGACTTCGA ACACATCAGA AATACTCACC ATTTCCTGAA GAGTTAAATC 4980

GAGTCATGGT AAGTAATATT GCTGAATTGA ATTTTGCGCC AACAGTAATT GCAGCTAAAA 5040

ATTTACTTTT TGAAAACAAA GACAAAGAGC GTATCTTTAT TACTGGAAAT ACAGTTATTG 5100 

ACGCATTGTC AACAACAGTT CAAAATGATT TTGTTTCAAC GATTATTAAT AAACATAAAG 5160

GCAAGAAAGT TGTTTTACTA ACAGCGCATC GTCGTGAAAA TATTGGGGAA CCGATGCATC 5220 

AGATTTTTAA AGCAGTAAGA GATTTGGCAG ATGAATATAA AGATGTTGTC TTCATTTATC 5280
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GGATTGAATT AATTGAGCCA TTAGATGCGA TTGAGTTCCA TAATTTTACA AATCAATCGT 54 00 

ACCTCGTGCT GACAGATTCT GGTGGTATTC AAGAGGAGGC TCCTACATTT GGAAAACCTG 5460

TGTTGGTATT AAGGAATCAT ACAGAGCGTC CCGAAGGCGT TGAGGCGGGA ACATCGAGAG 5520 

TAATTGGCAC AGATTATGAC AATATTGTTC GAAATGTGAA ACAATTGATT GAGGATGATG 5580 

AAGCGTATCA ACGTATGAGT CAAGCGAATA ATCCATATGG TGATGGACAA GCATCACGAC 5640 

GTATTTGTGA AGCAATAGAA TATTATTTTG GATTGCGCAC AGACAAGCCG GATGAATTCG 5700 

TACCTTTACG TCACAAATAA TAAAAAACCC CTAATCATGA AGTTGGTTTA GACAACCAGC 5760

GGTGACTAGG GGTTTTTAAT ATATTTATTT TTGATAGTGG TAGCCAATAT CATATTTGAA 5820

TACTTTATTT GATAATATTG GACTTTGCTG TCCATCGTCA TCACTTTTTA AACGTACATT 5880

TTTATGAGCT TCTTTAAATA CATCGGAATT CAACCAATTA TTAAAGCTAT CTTCAGATTC 5940

CCAAATAGTT AAGATTTTAA CTTCGTCTGT ATCCTCGGTA TTTAATGTTT TAGTGACAAA 6000

CATTTGTTGG AAGCCTTCAA TAGTTTCAAT ACCTTGTCTA TTGTAAAAAC GTTCAATCGT 6060

TTCTTCCGCA CTGCCTTTTT GTAATTGTAA TCTATTTTCT GCCATAAACA TGGGCAATCA 6120

CTCCTCTATT TTATGATTTG ATTTGGGTAA TGTTTTTACA AATGTAAAGA GTACAGCGGT 6180

TTGTATGATA ACCATTATGA TTAATCCTAC ACGGACTGCA AGAACATCCA CCATATAAAT 6240

TGAAAAACCT ATTACAATGT ATAAGCTAAT TAAAATTTTA ATTTTCTGTT GTAGCGTGTA 6300

GCCTCGATGT AAATAAAAGT TTTCTACATA TTCTTTATAA ATTTTTTGAT TAATAAGCCA 6360

ATTGTAAAAG CGATCTGAAC TTCGAGCAAA GCAAAAAACT GCTACGAGTA AAAAAGGGGT 6420 

CGTTGGCAGT AAAGGTAATA CGGCACCTGC AATACCAAGC GCTGTAAATA TTAAGCCAAT 6480

GACGATTAAA ATAAGTCGCA TTGAAAAAAC TCCATTCTAG TACTAATGCG CATGTAATAT 6540

TGTTTTAGTA ATATAACTCA TGCTAAATAT AATGTGTATG ATAAGTGGAA TGACTCAGTA 6600 

AAATGAAACG ATGTTGAATT ATCCTTGTCA CATTAACGCA TTTTAAGCGC GACTTTCATA 6660 

ACAACCAAAC TATTTAATGA GAATTATTCT CAAGTATTAT AGTTATATTA TGTGTTTTAT 6720 

TTTTGAAAAG TGCAATATGT TTTCGAAAAT AAGATTATTT TTATGTGCAA AAACGACGCA 6780

AAAGTTTTAA AAATGAGACT TCTGTGAGCT GATTATTTTA TAAAATGTAA ACGCTTACTA 6840

TATAATGTGA ATCATATCGT TTAAAAGCAT TATTAAATAT GATGCTAAGA GATTTATATT 6900

ATAGCCAATA AACAAAGGAG AGATAATATG GCAGTAAACG TTCGAGATTA TATTGCAGAG 6960

AATTATGGTT TATTTATCAA TGGGGAATTT GTTAAAGGTA GCAGTGACGA AACAATCGAA 7020

GTGACTAATC CAGCAACTGG AGAAACACTA TCACATATTA CAAGAGCAAA AGATAAAGAT 7080
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TCAGAACGTG 


CACAAATGTT 


GCGTGATATT 


GGTGATAAAT 


TAATGGCACA 


AAAAGATAAA 


7200 




ATTGCAATGA TTGAAACATT AAATAATGGT AAACCGATTC GTGAGACAAC AGCAATTGAT 7260
ATTCCATTTG CTGCAAGACA TTTCCATTAT TTCGCAAGTG TTATTGAAAC AGAAGAAGGT 7320
ACAGTGAATG ATATCGATAA AGACACAATG AGTATCGTAC GACATGAGCC GATTGGCGTC 7380
GTAGGTGCTG TTGTTGCTTG GAACTTCCCA ATGCTATTAG CTGCATGGAA GATTGCGCCA 7440
gCCATTGCTG CAGGTAATAC AATTGTGATT CAACCTTCGT CTTCAACACC ATTAAGTTTA 7500
TTGGAAGTTG CTAAAATTTT CCAAGAGGTA TTACCTAAAG GTGTTGTCAA TATACTAACG 7560
GGTAAAGGTT CAGAATCAGG TAATGCAATT TTCAATCATG ATGGTGTAGA TAAATTATCA 7620
TTTACGGGCT CAACTGATGT AGGTTATCAA GTTGCCGAAG CTGCAGCAAA ACATCTAGTA 7680
CCCGCTACAT TAGAGCTTGG TGGTAAAAGC GCCAATATCA TATTAGATGA TGCTAATTTA 7740
GACCTTGCAG TTGAAGGTAT TCAGTTAGGT ATTTTATTCA ACCAAGGTGA AGTATGTAGT 7800
GCAGGTTCTC GATTATTAGT TCATGAAAAA ATTTATGATC AATTGGTGCC ACGTTTACAA 7860
GAGGCATTTT CAAATATTAA AGTTGGAAAT CCACAAGATG AAGCTACACA AATGGGTAGT 7920
CAAACTGGTA AGGATCAATT AGATAAAATT CAATCATATA TTGATGCAGC AAAAGAATCA 7980
GATGCACAAA TTTTAGCAGG CGGTCATCGC TTAACTGAAA ATGGATTAGA TAAAGGGTTC 8040




TTCTTTGAGC CGACATTAAT TGctGTGCCA GACAATCATC ACAAATTAGC ACAAGAAGAA 8100
ATATTTGGAC CAGTGTTAAC AGTGATTAAA GTGAAGGACG ATCAAGAAGC AATTGATATA 8160
GCTAATGATT CTGAGTATGG TTTAGCAGGC GGTGTATTTT CTCAAAATAT CACACGTGCA 8220
TTAAATATTG CTAAAGCTGT ACGTACAGGA CGTATTTGGA TTAACACTTA CAACCAAGTA 8280
CCAGAAGGCG CACCATTTGG TGGTTATAAA AAATCAGGTA TCGGTCGAGA AACTTATAAA 8340
GGTGCGTTAA GTAACTATCA ACAAGTTAAA AATATTTATA TTGATACAAG CAATGCTTTA 8400
AAAGGTTTGT ACTAGAATAA ATATCGTTTC TGAAGCGTGT TTGTAGGTCA GTCTAGCGGT 8460
AAGTCTTAAC ATTTAACGGC GTTGTTTAGA TTTTAAGCAA AACAAAATAT ATAGGAACAC 8520
GTATCATGAT ATTAGGATAT AATGACTAAA ATAATAGCAG TAGGATGGTT TTTAATTGCA 8580
AATCATCTTA CTGCTGTTTT TAATTATGCT AATTTGCGAT GCGGCTATTA TAAGGACAGA 8640
GTTGTTTATT AATTATGGTG ATTTAGAAAT ATGAAGTTCA ATATGCAAAG TCATCGTTTG 8700
TTTTAATATG CGGAACAATC ATTAAAGTTA TTGCGATTTT TTGAACTTAA TGAAACTAAA 8760
CAATAAATTT GAGATACTTT TTTGTCATTT TTATGTAACT AACACAATAA TCTCGTACAT 8820
TATTAAAATT TTCTATATGA TAGGAATAAA GCAAAGCGCG AGTGTGCTGT AAAAGTTTTC 8880
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GATGATGTAT AAATCATGGT TAATTACGGA AGCATTAATA TTAACCTGAG AAGCTATAAA 90 00 

GAATTATTTT TAAAAGCGAC AATATTAAAT ACGACGCATT TATTTAGGAG TGGCAAACGT 9060 

ATGAATGGGA AAAAGGCGAA TACGATAAAC AGATACAAAT ATTTTCATCA TGTCAATCAT 9120 

CAAAAAATTC AACAAAGTTC TAAAAAGACG CTGTGGGCAT CACTAATCAT CACATTGTTA 9180 

TTTACAGTGA TTGAATTTGT CGGAGGTTTA GTATCTAATt CATTGGCATT ACTGTCAGAT 9240

TCATTTCATA TGCTTAGTGA TGTATTAGCA CTTGGTTTAT CTATGTTGGC CATTTATTTT 9300

GCAAGTAAAA AGCCGACTGC ACGATACACA TTTGGATATT TAAGATTTGA GATATTAGCT 9360

GCATTTTTAA ATGGTTTAGC ATTAATTGTA ATTTCAATCT GGATTTTATA TGAAGCTATT 9420

GTACGTATTA TTTATCCGCA ACCAATTGAA AGTGGCATTA TGTTTATGAT TGCTAGTATT 9480

GGTTTACTCG TCAATATTAT TTTGACTGTT ATCCTTGTAA GGTCTTTAAA ACAAGAAGAC 9540

AATATCAATA TTCAAAGTGC ATTATGGCAT TTCATGGGAG ACTTATTGAA CTCTATTGGT 9600

GTCATCGTTG CAGTTGTATT GATTTACTTT ACAGGATGGC GCATCATCGA CCCAATCATT 9660

AGTATTGTAA TTTCACTCAT CATTTTACGT GGTGGTTATA AAATTACGCG TAATGCgTGG 9720

tTAATTTTAA TGGAAAGTGT GCCTCAACAT TTGGATACTG ATCAAATTAT GGCAGATATT 9780

AAAAACATAG ATGGCATATT AGATGTACAT GAATTTCATT TGTGGAGTAT TACAACAGAG 9840

CATTATTCAT TAAGTGCCCA TGTTGTGTTA GATAAAAAAT ATGAGGGTGA TGATTATCAA 9900 

GCGATTGATC AAGTATCATC ATTGTTGAAA GAAAAATATG GCATTGCACA TTCAACGTTG 9960

CAAATTGAAA ACTTGCAATT GAATCCATTA GATGAGCCAT ACTTCGACAA ATTAACATAA 10020

ATAAAACATT GTAGCGCCTA AAACATTAAT CTATGTCATA GGCGCACGTT TCGTTTTATA 10080

CTTATGTTGC ATCATTTAAA TGATTTTCGT CAATTTCTTT GATGCTATCT ACATCTAACA 10140

CGAGATCTTT AGGTTTCAAA ATATGAATAT GTTTTTCATC ATTTGTATGT AAAATGCGTT 10200 

CTATGATGTA CCTTTGACCG GCCATTGTTT CTACAGCAAT CTTTTTGTTT CTAGCTAAAC 10260 

TTGCTACGAC AGATTCTTTA TCCATAATGA TAGCCCCCTA TATATATGTT TATTTACTTA 10320 

TACCCTAACA TGATTTTTAT ACTCTTTGAA AATATATTTT ACAGAATTTT ATCTAAATAT 10380 

TTAAAAAAAT ATCTTAATAT CCTTGTAATC CGATAAGAAT TATAGTAATA TTTTTTCAAC 10440 

CATtGTTATA GGAGGTCTTA TTAATGACAT TATTTTTATT AGAAGCTAAC AATCTTGATT 10500 

TTGCATCAAC GAAAGAAGAA CTAGAAGCAA AGGCAGCATC ACTATCTACG AAGACAATTC 10560

CAACATTAAT TGAAGTACAA GCTACTGAAA ATTTAACTCA TGGTTATTTT ATTGTGGAAG 10620 

CAAATGACGA aGCAGAAGCT AAACAATTTT TAACAGAAGC AGATATTAGT ATTCAATTAG 10680 
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TTGATTACCT TGTAACTTGG AACATTCCGG AAGGCATTAC GATGGATCAA TATTTAGCAC 10800 

GTAAAAAGAA AAATTCTGTT CATTATGAAG AAGTGCCAGA AGTTGAATTT AAACGCACAT 10860 

ATGTATGTGA AGATATGTCT AAATGTATTT GTTTATACAA CGCACCTGAT GAAGAAGCGG 10920 

TACGTCGCGC GCGCAAAGCA GTTGATACAC CGATTGATGG CATCGAAAAA CTTTAATAAG 10980

ACAACAAGTT GATGAGATAT ATGTATATAG GTTTGGCATG GATTTCGATT GCAGTTAATT 11040

AGAATAGCTC AATGCTATAA ATGTAAGTAG TTGATATGAA GAAACTAATG AACTAAATGC 11100 

AAGTATTGTC TAAAACAATC ATTTTATTGA AATTTAGTAG AGCTGAAATT AATATAACGT 11160

CGTTAATTGA ATAACGCTTA TGTTATAAGA GCACTCATAC CAAACCATAA TCATCTATAG 11220 

ATATAACAAT TCACGATATA AGGGCTGTGT TTGGCATAGC CCTTTAGATA TACACTTAAT 11280 

TCCTATTAAA ATAGTAGGGA TTAAAAGGGG GCTTGTCATG ATTAAAATTC AACAATTACA 11340 

ACATCACTTT GGATCACATA AAGTAATTCA TAACTTTAAT TTGGACATTA GCAAGGGAGA 11400

AATAGTCACT TTCATAGGGA AAAGTGGTTG CGGAAAGTCT ACTTTACTCA ATATTATCGG 11460

TGGATTTATT CATCCATCGT CTGGTCGTGT CATTATTGAT AACGAAATTA AACAACAGCC 11520

ATCTCCAGAT TGTTTAATGC TATTTCAACA TCATAATTTG CTGCCATGGA AAACGATTAA 11580

TGACAACATT AGGATTGGAT TACAACAGAA AATTAGTGAT GAAGAGATTA ACGCACAGCT 11640 

TAAATTAGTT GATTTAGAAG ACAGGGGAAA GCATTTTCCC GAGCAACTGT CCGGGGGTAT 11700 

GAAACAACGT GTGGCACTAT GTCGAGCGCA TGTGCATAAG CCTAACGTTA TATTGATGGA 11760

TGAGCCATTA GGTGCATTAG ATGCATTTAC ACGTTATAAA CTTCAGGATC AACTAGTGCA 11820

aCTAAAACAT AAAACGCAAT CAACTATTAT TTTAGTGACG CATGACATTG ATGAAGCTAT 11880 

TTATCTTTCC~GACCGCATTG~TTCTGTTXGG — TGA^ 11940" 

AATTACAGCA TCACATCCAC GCAGTCGTAA TGATAGCCAC CTACTTAAGA TTCGTAATGA 12000 

AATTATGGAA ACATTTGCAT TGAATCATCA TCAAGTTGAA CCTGAATATT ATTTATAAGG 12060 

AGTGAGTGAC GATGAAAAGG TTAAGCATAA TCGTCATCAT TGGAATCTTT ATAATTACAG 12120 

GATGTGATTG GCAAAGGACG TCTAAAGAAC GGTCTAAAAA TGCCCAAAAT CAGCAAGTGA 12180 

TTAAAATTGG ATATTTGCCG ATTACACATT CAGCTAATTT GATGATGACT AAAAAATTAT 12240

TATCACAATA CAATCATCCG AAATATAAAC TAGAATTAGT TAAATTCAAT AATTGGCCAG 12300

ATTTAATGGA CGCATTAAAC AGTGGTCGTA TTGATGGTGC ATCAACTTTA ATAGAGCTAG 12360 

CGATGAAATC AAAACAGAAG GGCTCAAATA TAAAGGCTGT GGCATTGGGC CATCATGAAG 12420 

GCAATGTCAT TATGGGACAA AAAGGTATGC ACTTAAATGA ATTTAATAAT AATGGCGATG 12480 
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GTAAACAATT AAAGATTAAA CCGGGGCATT TTAGCTATCA TGAAATGTCG CCAGCAGAAA 12600 

TGCCAGCCGC ATTGAGTGAA CACAGAATTA CAGGGTATTC TGTAGCCGAA CCATTCGGTG 12660

CACTGGGTGA AAAGTTAGGC AAAGGTAAGA CTTTGAAACA TGGTGATGAC GTTATACCTG 12720

ATGCGTATTG CTGTGTGCTA GTACTGAGAG GGGAATTGCT TGATCAACAC AAGGATGTAG 12780

CGCAAgCATT TGTACAAGAT TATAAAAAGT CTGGCTTTAA AATGAATGAT CGCAAGCAAA 12840

GTGTAGACAT TATGACGCAT CATTTTAAAC AAAGTCGTGA CGTTTTAACA CAGTCAGCGG 12900

CATGGACATC CTATGGTGAT TTAACAATTA AGCCATCCGG CTATCAAGAA ATTACGACAT 12960

TGGTAAAACA ACATCATTTG TTTAATCCAC CTGCATATGA TGACTTTGTT GAACCGTCAT 13020

TGTATAAGGA GGCATCGCGT TCATGACACG TCCCACAAAT AACAAATTTA TATTACCTAT 13080

TATCACATTT ATTATTTTCT TAGGCATTTG GGAAATGGTC ATTATTATTG GGCATTACCA 13140 

ACCTGTATTG TTACCGGGTC CTGCTCTTGT AGGAAAAAGT ATATGGTCTT TCATTGTTAC 13200 

TGGAGAAATT TTCCAACATT TAGCAATTAG TTTATGGAGA TTTGTAGCGG GCTTTGTTGT 13260

CGCATTGTTG GTTGCTATTC CATTGGGCTT CTTGCTTGGA AGGAATCGTT GGCTATACAA 13320

CGCTATCGAA CCGCTATTTC AATTGATTAG GCCGATATCT CCGATAGCAT GGGCACCATT 13380

TGTTGTTCTA TGGTTTGGTA TTGGTAGTTT GCCAGCGATT GCGATTATTT TTATCGCTGC 13440

TTTTTTCCCA ATTGTGTTCA ATACTATTAA AGGCGTTAGA GACATTGAAC CTCAATATTT 13500

AAAAATAGCA GCAAATTTAA ATTTAACTGG GTGGTCATTG TATCGCAATA TATTATTTCC 13560 

CGGGGCATTT AAACAAATCA TGGCTGGGAT ACATATGGCG GTAGGAACAA GTTGGATATT 13620 

TTTAGTTTCT GGTGAAATGA TTGGTGCACA ATCGGGATTA GGTTTTTTAA TCGTTGATGC 13680 

ACGAAATATG TTGAACTTAG AAGATGTTTT AGCAGCAATA TTCTTTATCG GATTATTTGG 13740

TTTTATTATT GATCGATTCA TTAGTTATAT TGAGCAGTTT ATACTTAGAA GATTTGGTGA 13800 

ATAAGGAGAG ATGATGATGA CTTTAGAAAC GCTTATCAAA GAACAATTAG ATCCTCATTT 13860 

AGTAGAAGTT GATGAAGGGA CGTATTATCC GAGAACATTT ATTCAGCAAT TATTTGTAGA 13920

TGGTTATTTC GGTGAGGCGG CATTGAGAAA AAATGCTGAA GTAATCGAAG CTGTATCGCA 13980 

GTCTTGTTTG ACAACAGGAT TTTGTTTATG GTGCCAATTA GCTTTTTCAA CGTATTTAGA 14040 

AAATGCCACG CAGCCACATT TAAATAATGA CTTACAACAG CAATTGTTAT CTGGAGAAAT 14100

ATTAGGTGCT ACCGGATTGT CTAATCCGAT GAAGTCATTT AATGATTTAG AAAAGTTGAA 14160 

CCTTGAACAC ACTTATGTTG ATGGACAATT GGTTGTCAGT GGACGTATGC CAGCTGTAAG 14220 

TAATATTCAA GAAGACCATT ATTTTGGTGC GATTTCGAAA CATGAATCAT CAGATGAATT 14280 
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TTTAGGAGTC 


AACGGGTCAG 


CAACGTATCA 


AATCACATTG 


AATCAAGTCG 


TAGTGCCACA 


14400 


c 


ATCACAAATT ATCACGCATG ATGCGAAGCA GTTTGCGGCA ACTATTCGCC CGCAATTTAT 14460
TGCTTACCAA ATTCCAATAG GATTAGGCTC AATTAAAAGT TCTTTAGAGT TAATTGATGC 14520
ATTTTCAAAT GTGCAAAACG GAATAAATCA ATATTTAGAG TATGATGTTG AAGCTTTTAA 14580
AAAACGTTAT CGTCAACTTA GAGAGGAATA TTATGCAATA TTAGATGACG GTAACTTAAC 14640
TTCACATTTA AATGAATTAA TATCATTGAA GAAGGACATC GGCTATTTAT TGTTAGATGT 14700
AAATCAAGCT TCTGTTGTCA ATGGTGGTTC TAGAGCGTAC ACACCATATT CGCCACAAGT 14760
TCGCAAGTTA AAAGAAGGAT TCTTCTTCGC AGCATTGACA CCGACATTAA GACATTTAGG 14820
TAAACTTGAA GCAGAGTTGA AGGGGTAAGT GTGATAAGCT GATTTTTTGT TTAGATGCGT 14880
TTGTTGAAAC ATTTTTTAAA ATAATATAAA TCTTAGTTTA TAAACATTTT CTGTTAATTT 14940
GTTATATCCT TTTAACTAGG AAAATATACA TTTCGTAATA ATAATAATCG TTATCATTGA 15000
AAAAGTGTTA ATAAGGTGTA TAATGAAAAT GTGAACAATT AATGAACTTC TTATTTTAAA 15060
GAAGGTGAAT ACTATAGATA CGCATACTAA AGAACAACAA TTCTCGAATC TAGTAAGATC 15120
TTATCGTAAA GAATACGTGG GTAAAGGACC CAATAGTATT CGAGTGTCGT TTAAAGATAA 15180
TTGGGCGATT GCACATATGA CAGGTGTTTT GAGTAAAGTT GAGAGTTTTT ACCTAAACGA 15240




CAAACGCAAT GAATCGATGC TCCATTATAC ACGCACAGAG AAGATTAAAC AGATGTATAA 15300
AGAAATAGAT GTAAATGAGA TGGAAAGTCT TGTAGGCGCT AAGTTTGTAA AATTATTTAC 15360
AGATATTGAT TTGAATGATG ATGAAGTCAT TTCAATATTT GTTTTCGATA AGTCAATAGA 15420
ATAAGTGTTG CTGGTGTAAG GTACACGGTG CTGTTTGCTA ACTTCGCTTT GAATTTAACA 15480
ATAATTCAAG GGGGTGGTAT GTCAAACGGT GGGGl-l-l-1-1-1- TGTGATATTT TTAAAAGAAG 15540
CAA(3VTGCAA CACGTACTTT AAGGAAGTCA AAATTTATCA TTTAGGAGAG ATGGATATGA 15600
AAATCGTAGC ATTATTTCCA GAAGCAGTAG AAGGTCAAGA AAATCAATTA CTTAATACTA 15660
AAAAAGCATT AGGATTAAAA ACATTTTTAG AGGAAAGAGG ACATGAGTTC ATTATATTAG 15720
CAGATAATGG TGAAGACTTA GATAAACATT TACCAGATAT GGATGTGATT ATTAGTGCGC 15780
CATTTTATCC TGCATATATG ACTCGTGAAC GTATTGAAAA AGCACCGAAC TTGAAATTAG 15840
CAATTACAGC AGGTGTAGGA TCTGACCATG TAGATTTAGC GGCAGCAAGT GAACACAATA 15900
TTGGTGTCGT TGAAGTTACA GGAAGTAATA CAGTTAGTGT GGCAGAACAT GCGGTTATGG 15960
ATTTATTAAT ACTTCTTAGA AACTATGAAG AAGGTCATCG TCAATCAGTA GAAGGTGAAT 16020
GGAACTTGTC TCAAGTAGGT AATCATGCGC ATGAATTACA ACACAAAACA ATTGGTATTT 16080
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TAGAACACTA TGATCCAATC AATCAACAAG ACCATAAATT GTCTAAATTT GTAAGCTTTG 16200 

ATGAACTTGT TTCAACAAGT GATGCGATTA CAATTCATGC ACCATTAACA CCAGAAACTG 16260 

ATAACTTATT TGATAAAGAT GTTTTAAGTC GTATGAAAAA ACACAGTTAT TTAGTGAATA 16320 

CTGCACGTGG TAAAATTGTA AATCGCGATG CGTTAGTTGA AGCGTTAgCA TCCGAGCATT 16380

TACAAGGATA TGCTGGTGAT GTTTGGTATC CaCAACCtGC ACCTGCTGAT CATCCATGGA 16440 

GAACAATGCC TAGAAATGCT ATGACGGTTC ACTATTCAGG TATGACTTTA GAAGCACAAA 16500 

AACGTATTGA AGATGGAGTT AAAGATATTT TAGAGCGTTT CTTCAATCAT GAACCTTTCC 16560

AAGATAAAGA TATTATTGTT GCAAGTGGTC GTATTGCTAG TAAAAGTTAT ACAGCTAAAT 16620 

AGAATAAGGA TGCTGGGCTA GCGATTAACG CTTTCAATTT TATATAAATG AATCATATAA 16680 

GCACTACTGC TGTTGTAAAG ATGGCAGTAG TTTTTTTATG ATTACATCTA AGTATAGTCA 16740 

CGGCTATGTT AGGACAATGA TTTAACATTT ACGCACATAT GTGTTCACTT ACGCAATTAT 16800

TGAnAAATnT CATTCATGTG GnAATC 16826
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4012 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

TTCAATGAGA GTAGTGGGCT GATGTTTAGC GATATCGCGT AAGATTAACC ATTGGCCATA 60

ATATATATTG TGTTTTTCTA AAATCGGCTC GGCTAATTTT AAATAGGGGC GATATATTGT 120

TAT/tfVAACTA TTGAAAAATT CTTGTGATAG CATAGTGACA TCTCCTAAGA CAAAATAGTT 180

AGCTTAGCTA mCCTTTTTAC AACAATAGTA ATTATAAAAC GGGAGCAATT AGAAATCAAT 240

ATATAATTAT TAAGAGCAAA AATAATTATA CTTTGTTAAA ATAAGCGTAA TTACATGTAA 300

ATAGGGGGAT ACTAATGATA TTGAAATTTG aTCACATCAT TCATTATATA GATCAGTTAG 360

ATCGGTTTAG TTTTCCAGGA GATGTTATAA AATTACATTC AGGTGGGTAT CATCATAAAT 420

ATGGAACATT CAATAAATTA GGTTATATCA ATGAAAATTA TATTGAGCTA CTAGATGTAG 480 

AAAATAATGA AAAGTTGAAA AAGATGGCAA AAACGATAGA mGGCGGAGTC GCTTTTGCTA 540 

CTCAAATTGT TCAAGAGAAG TATGAGCAAG GCTTTAAAAA TATTTGTTTG CGTACAAATG 600 

ATATAGAGGC AGTTAAAAAT AAACTACAAA GTGAGCAGGT TGAAGTAGTA GGGCCGATTC 660 
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ATCAGGATGA TGATGAAATT AAGCCACCAT TTTTTATTCA ATGGGAAGAA AGTGATTCCA 780 

TGCGTACTAA AAAATTGCAA AAATATTTTC AAAAACAATT TTCAATTGAA ACTGTTATTG 840

TGAAAAGTAA AAACCGATCA CAAACAGTAT CGAATTGGTT GAAATGGTTT GATATGGACA 900 

TTGTAGAAGA GAATGACCAT TACACAGATT TGATTTTAAA AAATGATGAT ATTTATTTTA 960 

GAATTGAAGA TGGTAAAGTT TCAAAATATC ATTCGGTTAT CATAAAAGAC GCACAAGCAA 1020 

CTTCACCATA TTCAATTTTT ATCAGAGGTG CTATTTATCG CTTTGAACCA TTAGTATAAA 1080

TATACGTAAG TGCTATGAGC GAGAATGCCC ATATGAATAA TGACAAGCAC AATGGAAAGA 1140

ATCGTTAATA TATTATTTAA TCGTGATGAC TTAATTAAAA TGAAAAAGAT TGATAATATA 1200 

AATGTGAAAA AGATAAGTAT AACCCGTAAA CTAAAGTAAT TCACGGTGAG AGGTTGACTC 1260 

AATGTCATAA TGATTGCAAC GATGTTCATA ATTATAAATA GACTTAAAAT AATTGTTCTC 1320

ATATCAAACA CCTCATTGTT AGATTATTGA CATTATAACA GGGGTAATTG TATATGAACA 1380

TTAATGTGGT TGCTTGAGGA AAAATTTATT CATTGAAGTC AAGTTGGTTC ATTTTAGAAA 1440

TGAATATCGT GTTAGATGAT GAAAGTATAT TGAAGTATAG GTAACTAGTT GAAAAGTATT 1500 

AATTGTACGA TAACATTAAA TTTAACACGA AACATAGATA TAAAATGATT CACAATTAAA 1560 

ATGGGTAAAT TTGAACTTGC TAAACTATTA ATTGGAGCAT GGACATTTCA AAAATAAGAG 1620

TTCAAATCTT ACACAAGCTC TGAATCGACA CTATAAGATA CAAACTGTAT AATTAAAGGT 1680

ATTGTTAAAT AGAAGGAGAT ATCATAAATC ATGGAAAAGA TGCATATCAC TAATCAGGAA 1740

CATGACGCAT TTGTTAAATC CCACCCAAAT GGAGATTTAT TACAATTAAC GAAATGGGCA 1800

GAAACAAAGA AATTAACTGG ATGGTACGCG CGAAGAATCG CTGTAGGTCG TGACGGTGAA 1860

GTTCAGGGTG TTGCGCAGTT ACTTTTTAAA AAAGTACCTA AATTACCTTA TACflCTATGT 1920

TATATTTCGC GTGGTTTTGT TGTTGATTAT AGTAATAAAG AAGCGTTAAA TGCATTGTTA 1980

GACAGTGCAA AAGAAATTGC TAAAGCTGAG AAAGCGTATG CAATTAAAAT CGATCCTGAT 2040

GTTGAAGTTG ATAAAGGTAC AGATGCTTTG CAAAATTTGA AAGCGCTTGG TTTTAAACAT 2100

AAAGGATTTA AAGAAGGTTT ATCAAAAGAC TACATCCAAC CACGTATGAC TATGATTACA 2160 

CCAATTGATA AAAATGATGA TGAGTTATTA AATAGTTTTG AACGCCGAAA TCGTTCAAAA 2220

GTGCGCTTGG CTTTAAAGCG AGGTACGACA GTAGAACGAT CTGATAGAGA AGGTTTAAAA 2280 

ACATTTGCTG AGTTAATGAA AATCACTGGG GAACGCGATG GCTTCTTAAC GCGTGATATT 2340

AGTTACTTTG AAAATATTTA TGATGCGTTG CATGAAGATG GAGATGCTGA ACTATTTTTA 2400 

GTAAAGTTGG ATCCAAAAGA AAATATAGCG AAAGTAAATC AAGAATTGAA TGAACTTCAT 2460 
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CAAAATATGA TTAATGATGC GCAAAATAAA ATTGCTAAAA ATGAAGATTT AAAACGAGAC 2580 

CTAGAAGCTT TAGAAAAGGA ACATCCTGAA GGTATTTATC TTTCTGGTGC ACTATTAATG 2640

TTTGCTGGCT CAAAATCATA TTACTTATAT GGTGCGTCTT CTAATGAATT TAGAGATTTT 2700

TTACCAAATC ATCATATGCA GTATACGATG ATGAAGTATG CACGTGAACA TGGTGCAACA 2760

ACTTACGATT TCGGTGGTAC AGATAATGAT CCAGATAAAG ACTCAGAACA TTATGGATTA 2820

TGGGCATTTA AAAAAGTGTG GGGAACATAC TTAAGTGAAA AGATTGGTGA ATTTGATTAT 2880

GTATTGAATC AGCCATTGTA CCAATTAATT GAGCAAGTTA AACCGCGTTT AACAAAAGCT 2940

AAAATTAAAA TATCTCGTAA ATTAAAACGA AAATAGATTA ACGACTGAAA TCTGAACGCT 3000

CATAAGACTG TCATTTGCGT TCAGATTTTT TTACACAATA TAGAATGGTT GAGTAAAATA 3060

TTTTTGAATA TAGTGAAAGA GGGGGAAGTA CTGTGATAAA AAAGCTATTA CAATTTTCTT 3120

TAGGGAATAA GTTTGCTATC TTTTTAATGG TTGTTTTAGT TGTCTTGGGC GGTGTATATG 3180

CGAGTGCTAA ATTGAAATTA GAATTACTAC CAAATGTACA AAATCCAGTT ATTTCAGTTA 3240

CAACAACAAT GCCGGGTGCA ACGCCACAAA GTACCCAAGA TGAAATAAGT AGTAAAATTG 3300 

ACAATCAAGT AAGATCATTG GCATATGTGA AAAATGTTAA AACGCAATCC ATACAAAATG 3360

CTTCAATTGT AACAGTTGAA TATGAAAATA ATACAGATAT GGATAAAGCA GAAGAACAGC 3420

TTAAAAAAGA AATCGATAAA ATTAAATTTA AAGATGAAGT TGGTCAACCA GAATTAAGAC 3480

GTAATTCGAT GGATGCTTTT CCGGTTTTAG CATATTCATT TTCAAATAAA GAGAATGACT 3540

TGAAAAAAGT AACGAAAGTA CTGAATGAAC AATTAATACC AAAATTGCAA ACGGTAGATG 3600

GTGTGCAAAA TGCGCAATTA AATGGGCAGA CGAACCGTGA AATCACCCTT AAATTTAAGC 3660

AAAATGAACT TGAAAAATAT GGGTTGACTG CTGATGATGT AGAAAACTAT CTAAAAACGG 3720

CAACAAGAAC AACGGCACTT GGATTGTTCC AATTTGGTGA TAAAGATAAT CAATTGTTGT 3780

TGATGGTCAA TATCAATCTG TTGATGCTTT TAAAAACATA AATATTCCAT TAACGTGGCA 3840

GGAGGACCAA GGGCATCTCA TCCCAAAGTG ACCATAAACC AAATTCAGCC ATGTCAGACG 3900 

TTATCAGGCA TCACCACAGC AAATTCAAAG CGTCAGCnCC AATATATAGT GGATGCCGCA 3960 

nGAACTAGGG GTTTAGCGnT ATCAGTGGTG TGGCGACTCT ATTCTAAACG AT 4012 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7778 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

CAATATAGGT CGCCGAGTTT CAACTaCATC AACTGGTTCA GTTACATTAG ATAATGCGCT 60

AGGTGTAGGT GGCTATCCTA AAGGACGAAT TATTGAAATT TATGGTCCTG AAAGTTCTGG 120 

TAAGACAACA GTAGCGCTTC ACGCTATTGC TGAAGTACAA AGTAATGGCG GGGTGGCAGC 180 

ATTTATCGAT GCTGAACATG CTTTAGATCC AGAATATGCT CAAGCATTAG GCGTAGATAT 240

CGATAATTTA TATTTATCGC AACCGGATCA TGGTGAACAA GGTCTTGAAA TCGCCGAAGC 300

ATTTGTTAGA AGTGGTGCAG TTGATATTGT AGTTGTAGAC TCAGTTGCTG CTTTAACACC 360

TAAAGCTGAA ATTGAAGGAG AAATGGGAGA CACTCACGTT GGTTTACAAG CTCGTTTAAT 420

GTCACAAGCG TTACGTAAAC TTTCAGGTGC TATTTCTAAA TCAAATACAA CTGCTATTTT 480

CATCAACCAA ATTCGTGAAA AAGTTGGTGT TATGTTCGGT AATCCAGAGA CTACACCAGG 540

TGGACGTGCA TTAAAATTCT ATAGTTCAGT AAGACTAGAA GTACGTCGTG CAGAACAGCT 600

TAAACAAGGA CAAGAAATTG TAGGTAATAG AACTAAAATT AAAGTCGTTA AAAATAAAGT 660

GGCACCACCA TTTAGAGTAG CTGAAGTTGA TATTATGTAT GGACAAGGTA TTTCTAAAGA 720 

GGGTGAACTT ATTGATTTAG GTGTTGAAAA CGACATCGTT GaTAAATCAG GAGCATGGTA 780

TTCTTACAAT GGCGAACGAA TGGGTCAAGG TAAGGAAAAT GTTAAAATGT ACTTGAAAGA 840

AAATCCACAA ATTAAAGAAG AAATTGATCG TAAATTGAGA GAAAAATTAG GTATATCTGA 900

TGGTGATGTT GAAGAAACAG AAGATGCACC AAAGTCATTA TTTGACGAAG AATAGTACAC 960

AAATTTATAT CTATAGTTAA ACTTAGCAAA TATCCTTATA GGATTGATTG AAAGTGATAT 1020 

TCATCTCATA AAGCTAGAAT AATATCTAAC TTTATGGGAT ACACTACAAA TCGAGACTAT 1080 

— AAGGl^l^l-i^A^l-i-iATTTA- 5 ITATTAC^TT-ATCAATAGTT-TTATAATCGA-GCTTCAAAAC ri~4"0 

TTTAGAAAAT AGTAGAAATA GCATTCAATA TAGTGCAAAA GTGCAAATTG ATAACTTGAC 1200 

ACTTATCTCC TATAAACCGT ACAATTAATT TGTATGATTT ATATATAATT TCATAAAGTC 1260 

ATATTGAATT TCATATAAAG AGCAAACCCT AGAAAAGGAG GTGTTTGTGT GAATTTATTA 1320

AGCCTCCTAC TCATTTTGCT GGGGATCATT CTAGGAGTTG TTGGAGGGTA TGTTGTTGCC 1380

CGAAATTTGT TGCTTCAAAA GCAATCACAA GCTAGACAAA CTGCCGAAGA TATTGTAAAT 1440 

CAAGCACATA AAGAAGCTGA CAATATCAAA AAAGAGAAAT TACTTGAGGC AAAAGAAGAA 1500 

AACCAAATCC TAAGAGAACA AACTGAAGCA GAACTACGAG AAAGACGTAG CGAACTTCAA 1560 

AGACAAGAAA CCCGACTTCT TCAAAAAGAA GAAAACTTAG AGCGCAAATC TGATCTATTA 1620 

GATAAAAAAG ATGAGATTTT AGAGCAAAAA GAATCAAAAA TTGAAGAAAA ACAACAACAA 1680 
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CGCATCTCCG GTCTCACTCA AGAAGAAGCT ATTAATGAGC AACTTCAAAG AGTAGAGGAA 1800 

GAACTGTCAC AAGATATTGC AGTACTTGTT AAAGAAAAAG AAAAAGAAGC TAAAGAAAAA I860 

GTTCATAAAA CAGCAAAAGA ATTATTAGCT ACAGCAGTAC AAAGATTAGC AG C AG ATCAC 1920 

ACAAGTGAAT CAACGGTATC AGTAGTTAAC TTACCTAATG ATGAGATGAA AGGTCGAATC 1980 

ATTGGACGAG AAGGACGAAA CATCCGCACA CTTGAAACTT TAACTGGCAT TGATTTAATT 2040 

ATTGATGACA CACCAGAAGC GGTTATATTA TCTGGTTTTG ATCCAATAAG AAGAGAAATT 2100 

GCTAGAACAG CACTTGTTAA CTTAGTATCT GATGGACGTA TTCATCCAGG TAGAATTGAA 2160 

GATATGGTCG AAAAAGCTAG AAAAGAAGTA GACGATATTA TTAGAGAAGC AGGTGAACAA 2220 

GCTACATTTG AAGTGAACGC ACATAATATG CATCCTGACT TAGTAAAAAT TGTAGGGCGT 2280 

TTAAACTATC GTACGAGTTA CGGTCAAAAT GTACTTAAAC ATTCAATTGA AGTTGCGCAT 2340 

CTTGCTAGTA TGTTAGCTGC TGAGCTAGGC GAAGATGAGA CATTAGCGAA ACG AG CTGG A 2400 

CTTTTACATG ATGTTGGTAA AG CAATTG AT CATGAAGTAG AAGGTAGTCA TGTTGAAATC 24 6 0 

GGTGTAGAAT TAG CGAAAAA ATATGGTGAA AATGAAACAG TTATTAATGC AATCCATTCT 2 520 

CATCATGGTG ATGTTGAACC TACATCTATT ATATCTATCC TTGTTGCTGC TGCAGATGCA 2580 

TTGTCTGCGG CTCGTCCAGG TGCAAGAAAA GAAACATTAG AGAATTATAT TCGTCGATTA 264 0 

GAACGTTTAG AAACGTTATC AGAAAGTTAT GATGGTGTAG AAAAAGCATT TGCGATTCAG 2700 

GCAGGTAGAG AAATCCGAGT GATTGTATCT CCTGAAGAAA TTGATGATTT AAAATCTTAT 276 0 

CGATTGGCTA GAGATATTAA AAATCAGATT GAAGATGAAT TACAATATCC TGGTCATATC 2820 

AAGGTGACAG TTGTTCGAGA GACTAGAGCA GTAGAATATG CGAAATAATT TTTGTCTCCC 2 880 

TCACAAATTA GTGAGGGAGC TTTTTTAAGT TGTAGTCTTA At CTAGTTAG ACAGCACTTT 2940 

ATCGGTAATA ACTATATTAA ACAGTAGTTA TTTGAAAGTA AGACGGACCT TATATTAAAT 3000 

AAGAAGTTAT TGCTTTTAAT AAAAATGTTT TAGGCTTCGT AATTACTATA TTTATATTAT 3060 

GTAAACCTAT AAAGATGATT GGTTTTCTAT CCAATAAAAA AGAAGAGAAG ATGTAACACA 3120 
TCTTCTCTTC yGCAATATTA ATTAGGATTT ATTTCTAAGT TGAGTTATTT TAATTGTAAA 3180 
TCTGTTTTCT TTAATTCTTT TATAACTTCT GCAGTATCAT AACAATTTGT TG CAATTG TT 3240 
GAATATCTCT CTGCTAAACG ATATGCATTA ATGTAAAGCT TTAAACTTTC TTTAGCTATA 3300 
TCCTCTGCAT CTTCGAATTT TGATGGGTTA GACATAACCA CTAATTCTGC AAATTTTTCT 3360 
GGATCAATAT TAATAGACAT GTATTTATTT ACAACTCCTA TTTATTTTGA TGTCTTAATA 3420 
CTAACATATT GAAGTTTTCA GACAAAGTAA TGTCTCTCTA TAATTGAAGA AAAATAATTC 3480 
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GGATGAACAA AACATGAGAA TAATGTTTAT AGGGGATATC GTAGGTAAAA TTGGACGAGA 3 600 

CGCAATTGAA ACGTACATAC CTCAACTGAA GCAAAAGTAT AAACCAACAG TTACAATTGT 3660 

AAATGCTGAA AATGCAGCAC ATGGTAAAGG TTTGACTGAA AAAATATATA AACAATTACT 3720 

AAGAAATGGT GTAGATTTCA TGACTATGGG TAATCACACA TATGGTCAAC GTGAAATTTA 3 780 

TGATTTTATA GATGAAGCAA AACGACTAGT AAGACCAGCG AATTTTCCGG ATGAAGCGCC 3 840 

GGGAATTGGT ATGAGATTTA TACAAATTAA TGATATTAAA CTTGCAGTTA TTAATCTGCA 3 900 

AGGAAGAGCG TTTATGCCAG ATATTGATGA TCCTTTTAAA AAGGCAGATC AATTAGTCAA 3 960 

GGAAGCACAA GAACAAACTC CGTTTATATT TGTTGATTTT CATGCAGAAA CAACTTCTGA 4020 

AAAGTATGCA ATGGGATGGC ATTTAGATGG TAGAsTAGCG CTGTTGTTGG AACGCATACA 4 080 

CACATTCAAA CAGCAGATGA ACGTATTTTA CCAAAGGGGA CAGGGTATAT AACGGATGTT 4140 

GGTATGACAG GTTTTTATGA TGGCATTTTA GGAATAAATA AAACAGAGGT AATTGAGCGT 4200 

TTTATCACTA GTTTGCCACA AAGACATGTT GTTCCAAATG AAGGTAGAAG TGTATTATCT 4 260 

GGTGTTGTTA TTGATTTAGA CAAAGAAGGT AAAACAAAGC ACATCGAACG TATATTGATA 4320 

AATGATGACC ATCCATTTTC AACATTTTAA AATTACGTAA GTAAACATTC GAATTGGACC 4380 

CTATCGTCCA TTAGTATGAA TTTAATATAG TACCACTGTT TACATAGTAA ATCGGTGGTT 4440 

CTTTTTGTTA TCATTTAATA TGAAATATAT CCATAGGAGG CATATAACTA TGAAACCACA 4 500 

ATTATCGTGG AAAGTTGGCG GTCAACAAGG CGAAGGTATT GAATCAACTG GGGAAATCTT 4560 

CGCTACGGCT ATGAATAGAA AAGGATATTA TTTATATGGA TATAGACATT TTTCAAGTCG 4 620 

TATCAAAGGT GGACATACGA ATAATAAAAT TAGAGTTTCT ACGACGCCTG TTCATGCAAT 4 6 80 

— TAGTGATGAT— ! 33?AGATA ! FTT—' TGATTGCATT-TGAeeAAGAA-AeAATTGATG-rrAAeG^T^ 4740- 

TGAAATGAGA GAAGACAGTA TTATTTTArC TGATGCCAAG GCTAAACCTG TGAAa CCAGA 4 800 

AGGATGTCAT GCACAGCTTA TTGAATT AC C TTTTACAGCA AC CGCTAAAG AATTAGGTAC 4860 

AGCATTAATG AAAAACATGG TTGCAATAGG TGCTACTAGC GCATTGATGA ATTTGAATAC 4 920 

AAATACATTT GAAGAACTTA TTACTAATAT GTTTTCTAAA AAAGGTGACA AGGT AG TTG A 4980 

AGTCAATATC CAAGCATTAA ACGAAGGTTA TCAATTAATG CAATCTCGCT TACCTGAAAT 5040 

CTACGGGGAC TTTGAATTAG AGTCAACAGA TGCACTACCA CATCTATATA TGATTGGTAA 5100 

CGATGCCATT GGATTAGGTG CAATTGCTGC AGGTTCACAA TTTATGGCGG CATATCCTAT 5160 

TACACCTGCG TCTGAAGTTA TGGAATATAT GATTGCCAAT ATATCTAAAG TAAACGGAGC 5220 

GGTTATTCAA ACAGAAGATG AAATTGCTGC TGTAACTATG GCTATTGGTG CAAATTATGG 5280 

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TGGATTATCT GGTATGACTG AAACGCCATT AGTCATTATT AATACCCAAC GAGGTGGACC 5400 

TTCTACTGGA TTACCTACGA AACAAGAACA GTCAGATTTA ATGCAAATGA TTTATGGTAC 5460 

ACATGGTGAT ATTCCAAAAA TTGTTGTAGC ACCAACAGAT GCAGAAGATG CATTTTATTT 5520 

AACTATGGAA GCATTTAATT TAGCAGAACA ATATCAATGC CCTGTTATAG TTCTAAGTGA 5580 

TTTGCAATTA TCTTTAGGTA AACAAACTGT TGAAAAATTA GATTATAATC GTATTGAAAT 5640 

TAAACGTGGT GAAATCATTC AATCTGATAT TGAACGTGAA GAAGATGATA AAGGTTATTT 57 00 

CAAGCGTTAT GCGT t AACAT CCGATGGTGT TTCTCCTAGA CCTATCCCCG GTGTTAAAGG 5760 

AGGTATTCAT CATATAACTG GTGTGGAaCa CAATGAAGAA GGTAAACCTA GTGAATCTGC 5820 

GTCAAATAGA CAACAACAAA TGGAAAAACG AATGCGTAAA ATTGAGCAGT TACTAATTGA 5880 

ATCGCCAGTA GAAGCTAACT TACAACATGA GGATGCAGAT ATTCTTTATA TCGGTTTTAT 5 94 0 

TTCTACAAAA GGTGCAATTC AAGAAGGTAG TAACCGTTTG AATCAACAAG GCATAAAAGT 6 000 

TAACACTATA CAAATTAGAC AATTGCATCC ATTCCCAACA AGCGTTATTC AAGATGCAGT 6 060 

TAATAAAGCG AAGAAAGTCG TTGTAGTGGA GCACAATTAT CAAGGACAAT TGGCTAGTAT 6120 

TATAAAAATG AATGTCAATA TTCATGATAA GATTGAAAAT TATACAAAGT ATGATGGGAC 6180 

AC CTTTCCT A CCACATGAAA TCGAAGAAAA AGGCAAAATA ATTGCTACTG AAATAAAGGA 6240 

GATGGTATAG ATGGCGACAT TTAAAGATTT TAGAAATAAT GTTAAGCCTA ACTGGTGCCC 6300 

CGGATGTGGC GATTTCTCAG TACAAGCTGC AATTCAAAAA GCAGCCGCAA ATATAGGGTT 63 60 

AGAACCTGAA GAAGTAGCTA TCATCACCGG TATAGGATGT TCTGGCCGTC TTTCAGGATA 64 20 

TATTAATTCT TATGGCGTTC ATT CT ATT CA CGGACGTGCA TTACCTTTAG CTCAAGGTGT 64 80 

AAAAATGGCG AATAAAGATT TAACTGTTAT TGCATCGGGA GGAGATGGTG ATGGTTATGC 6540 

TATAGGTATG GGG CATACAA TCCATGCTTT AAGAAGAAAT ATGAACATGA CGTATATAGT 6600 

CATGGATAAT CAAATTTATG GTTTGACAAA GGGACAAACA TCGCCGTCAT CAGCAGTAGG 6660 

ATTTGTTACT AAAACAACGC CAAAAGGTAA TATAGAAAAA AATGTTGCGC CTTTAGAATT 6720 

AGTATTATCA TCTGGTGCCA CATTTGTAGC CCAAGGTTTT TCAAGCGATA TTAAAGGATT 6780 

AACAAAACTA ATTGAAGATG cAATTAATCA TGATGGATTT TCATTCGTTA ATGTCTTTTC 6840 

AC CATGTGTG ACTTATAATA AAATTAACAC ATACGATTGG TTTaAAGAAC ATTTAACAAG 6900 

TGTTGATGAc ATTGAAAATT ATGATTCTAC AGATAAACAA TTAGCGACTA AAACTGTTAT 6960 

TGAACATGAA TCTTTAGTAA CTGGTATTGT TTATCaAGAT AAAGAAACAC CATCATATGA 7020 

ATCtCAAATT AAAGAGTTAG ATGATmCACC ACTTGCTAAA AGAGATATCa AAATTaCTGA 7080 

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TGTATTTATA ACAG AT C CAT TTATGCTACT CAGTTTTTTA CTATTACAAA AAATAAAGGA 7200 

GTTTTTAAAA ATGAAAGACA CATTAATGAG TATACAAATA ATTCCTAAAA CACCAAACAA 7260 

TGACAATGTT ATACCTTACG TAGACGAGGC GATTAAAATA ATTGACGAAT CTGGTTTGCA 7320 

TTTTAGAGTA GGTCCGTTAG AAACGACAGT ACAAGGAAAT ATGAATGAAT GTTTAATTTT 73 8 0 

AATACAATCA TTAAATGAAC GAATGGTGGA ACTTGAATGT CCAAGTATTA TT AG C CAAGT 744 0 

TAAGTTTTAT CATGTGCCAG ATGGCATCAC TATTGAAACT TTAACTGAAA AATATGATGA 7 50 0 

ATAACATTAA AAGTGAAGTA AACTGGATTT GAATTGGCTT GTTAGAGATG ACGTATAACT 7 560 

TTAACTGTTT TTGCACTTTA TAGTTAAATT TAATATAATT ATTAAATGAT ACGGGCAAAT 7620 

AGAAAGGATT TTGTAAAGTG AACGAAGAAC AAAGAAAAGC AAGTTCTGTA GATGTTTTAG 768 0 

CTGAGAGAGA TAAGAAAGCA GAAAAAGATT ATAGTAAATA TTTTGAACAT GTTTATCAGC 774 0 

CGCCTAATTT AAAAGCAAGC GCAAAAAAAG AGGTnAAA 777 8 
(2) INFORMATION FOR SEQ ID NO: 49: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 




AGATGAAGTT GTTACgAAAA TTGCGTACGC TGTTTCAGAA CATGTCAAAA TAGAAACAGG 60 

TAATGGATTG TTTGAAAGAT eACATAGTGG TTGTGCGACG GGCGGATCCT GTAATTGTTC 120 

ATTATAAAAA ACATCGAGTC AGAAAAAGGT GGTTATTGAA cCACTAACTA GCATCTGACT 180 

CGATGTTTTT ATTTATTCGG GATTGTTTGT TTGAATTGTT GTGCTAAATC TGGTCGATCT 240 

GTCACAATCG TGTGTGCACC TTTTTGGTAT AAATCATTCA TCAGATTTAT ACTATTTACG 300 

CCATAATAGC CTGGAATGAT ATTCATATCA TTTAACCATT TGATAAAACG AGATGAAGTC 360 

AAATCAATGC CTTTAAAATG AGTAGGCATT TGGAACGTTT GTGCTAATGG TTGGTAGTAC 420 

CTACCACCTA ATAAATGATA TTTTAAAAAT GCTTCTGTAA CTTCCTGTTG GCTAGCACCA 480 

ATTGCGACGG ATCCTTGTGC AATTTTATTA AAACGAACGA TTTGTTCTTT ATAAAAACTT 540 

GTCACAAGAA CGCGGTCAAA TGCTTGATTT TCTGCAATTG TATCAAACAT AATTTGTGGT 600 

GCGATTGAGC CTTCATAGGA TTCAGGAGCA TCTTTTAAGT CTACGTTTAT ATACATATCA 660 

GGATATTGCT TCAGCAACTc ATCGAAGGTT AGTATAGCTG TGTGTGCATG ACCACGATAT 720 
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AATGTATGGG CACTAACTTT TCCAGAGCCG TTCGTCGTTC TATCAACAGT TGCGTCATGA 840 

AAAACGATAA GCTGTTGATC TTTTGTGAGT CTCACATCTG TTTCAAAGCC ATCAACGCCT 900 

AATTGTTTAG CATAGTCAAA TGCAAGTTGC GTTTGCTCTG GTCTTAAAGC CATACCACCG 960 

CGATGCGCAA ATATATATGG TGCATTGCCT TTGAAAAAAG CAGGGATGGT TTGCTTTTTA 1020 

GTAATCACTT TATTTTTATT GATCATTAAT AGACTACTTA AAAATCCAGC ACCGACTAGT 1080 

ACCGCATTTA AAATGTTTCT GTTTACnTTT TTCATAAAAA ATTCCTCC 1128 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6252 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

CAAGCAAACA ATCGTCGATA AAATTGCTAA AATAATAAAA GTAATTCGAA CTTTCATCAT 60 

GATCATC CTT TGTTTATAGA GTCAATATAA GTATGGAATA TGTTAGGTAT ATAGTCAAAT 120 

GCGTCAACTA ATGGGAATTT TGGCATAGAT AGAGAATTTA AGGCAATTAA AAAGGCATCA 180 

AACAGTAATA TGCTGCTTGA TGCCCAAATG ATGACTTTAG CTAAATTGAT TAGTCACTTT 24 0 

TAAAGATAAA GAATTGTCAT GAATTAAAAC TCATGTAATG ATGTGTTACA TTTCGCAATG 300 

ATGGCTTTCA GTTATTTATC GATAACATCA CTCTTGATAC CTTTAGATTT TAAGAAATCT 360 

TTAATTTTAT CTTGTTGCTT TTTATTAACA T CACCGG CAT ATTTTGTTGG CACGTCGACA 420 

ACATTGATTT TATTTTGCGG TTGATAGCTA AGCTTTTCAA TATCTTCATC AACATTGGCG 480 

ATTGTACTAT TTAAAGCTTT GAAGTAATTC ATCATTAATT CAACGGGTTT CTTATATTCT 540 

TTAGGAATAT TGTTTTCAGT GACAAATTTC TTGAAATGCA AATCGTTTTT AACAGCTAAG 600 

TTAGATAAGT GGCTAAGTGT TTCTGCTTGT TTTTCAGTCA CTTTTGTTTG ACTGTCAATT 660 

TGTTTATCTA GTTTATGTTG CATAATATAT TTGTTATCAA GTATATCGCT ATTTACAGAC 720 

AAATACTTTT CTATAGCTTG CTTCATCTCT GCATCACTAA TATCACTATT TTTCTTATCT 780 

GAGTTAAAGA TATCTTTTGT tTCTAATTTT TTAGCGCTTT TAGGTGCATG GATGCCAGTA 840 

CTTGTATGAT GATCTTCGTT ATCAGATTGA TCGGACGCGC AACCTGTAAG AATTAATGTC 900 

GATGCTAAAA ATGTACTTAG TAGTAATCTC TTTTTCATAA TGTAATATAA CTCCTTAGTT 960 

TATCTTTAAT TGAAAAAATA TGTATTCATG TTTAATAGAG TAACATTGAA TTAGTTTGGA 1020 

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TCTATCAATA ATG CAT CATT TTGGACGTTG TTAAGGATAG CTTTATCTAT AAATAACTGC 1140 

ATAATTGGTT GTACTAATTT AGACGTAGGT ATCGTACGTA AAAGCATAAT AATTTCGTTC i 2 00 

ACATACTTTT CTTTCTCAAT ATCATTTTTC ATATTGATTT GTTTGCGAGA GGTACATACT i 260 

TTAAGCATTA TCGCACATCT CGTTGTATAT ATTAAGTTTA TCATAACATG ATTTTATGTC 132 0 

GGGATAAAAA AATAACAGCA TCTTAACAAA TGTAAGATAC TGTCAGTGAA ATGAATGAAA 1380 

CTTTAGTTTC TGaTAATATA GTCAAAGGCA TTTAATGCTG CATTTGCACC AGCGCCCATT 144 0 

GAAATGATAA TTTGTTTGTT CTTCTGATCT GTGACATCGC CAGCAGCAAA TATTCCAGGA 1500 

ACATTCGTAT TATTGTTACG ATCAATCACA ATTTCACCAC GTTCGTTTAA TTCAACAGCA 1560 

TCGTTTAACC ATGATGTGTT TGGAAGTAAA CCAATTTGAA CAAAG AT AC C ATCTAAGTTA 1620 

AGTAGATGTT CTTCGCCGGT GTTCATGTCT TCGTAACGTA TACCTGTAAC ATGGTCTTCT 16 80 

CCGACAACTT CAGTAGTTTT GGCATTTGTT TTGATATCAA CATTTGATAA AGAACGTAAA 174 0 

CGATCTTGTA ACACGTTGTC TGCTTTTAAT TCGCTAGCGA ATTCGAATAA TGTAACATGA 1800 

TTAACGATAC CAGCAAGGTC AATTGCTGCT TCAACCCCAG AGTTACCGCC ACCGATAACT 1860 

GCTACGTCTT TATTTTCAAA TAGAGGTCCG TCACAGTGAG GG CAGAATG C AACACCTTTA 1920 

TTAATCAATT GCTCTTCACC TGGAATGTTT AGCTTACGCC AACCTGCACC AGTAGCAATA 1980 

ATGACTGTTT TACTTTCTAA GACAGCACCG TTTTCTAACG TAACTTTAAT TGCTTCGTCA 2040 

GTCTTTTCGA TATCTGTAGC ACGTATACCT GTCATTGCAT CAATGTCATA TTGATCAATG 2100 

TGCGCTGCTA AGTTAGAAGA AAATTCAGAA CCAGTTGTTT CTTTAACAGT AATGAAGTTC 2160 

TC^TACC^G-mGTATCATT-AACI^ CAGCAACTAT ACCAGTACGT 2220 

AAACCTTTAC GTGCTGTGTA AATCGCTGCA CTACCACTAG CAGGACCACC ACCAACGATT 2280 

AAGACATCAT AAGGTTCTTT ATTTTCAAAC TCAGATGCAT CTGCCGTACT GCCTAGTTTC 2340 

GAAAGAATAT CTTGGATTGT CATACGACCA TTGCCAAATT CTTCGCCATT TAAAAAGACA 24 00 

GCAGGGACTG CCATGATGTT TTCAGATTCT TCACGGAACA CTGCACCATC AATCATAGAA 2460 

TGCGTGATGT TAGGGTTGAT CACACTCATT AAGTTAAGTG CTTGAACGAC ATCAGGACAT 2520 

TTTTGACACG TTAAACTAAT GAATGTTTCA AAATGGAATG AACCTTCTAA TTTTTTAATT 2580 

TGGTCAATGA TTGACTGTTT TTCTTTAGGT GCACGACCAC TAACCTGTAA AATTGCTAAA 2640 

ACAAGTGAGT TAAACTCGTG ACCTAATGGA ATACCTGCAA ATGTTACACC TGTTTCTTCG 2700 

CCAGGACGAT TGACTGAGAA ACTTGGTGTA CGTTTTAAAG ATTTTTCAGA AAGAGATAGT 2760 

CTAGGTGACA TATCAGTAAT TTCTGTCAAC AAATCTTTAA GTTCTTTGGA TTTATCATCT 2820 
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10 



TGTTGTTTTA AATCAGCATT AAGCATGGTT GTAATGCCTC CTTAGATTTT ACCTACTAAA 2940 

TCTAAACCAG GTTGCAATGT TTTAGCGCCT TCTTCCCATT TAGCTGGGCA TACTTCGCCA 3 000 

GGGTTTTTAC GAACATATTG AGCTGCTTTG ATTTTGTGAG CTAATGTACT AGCGTCACGG 3060 

CCAATTCCGT CAGCGTTAAT TTCAGATGCT TGTACAACAC CGTCTGGGTC GATAATGAAT 3120 

GTACCACGTT GAGCTAAACC AGTAGCTTCA TCTAATACAT CAAAATTACG AGTGATTGTT 3180 

TGTGATGGGT CACCAATCAT AGTGTAAGTG ATTTTGCTAA TTGCATCTGA ATGGTCATGC 324 0 

CATG CTTTGT GTACGAAGTG AGTATCAGTT GATACTGAGA ATACATTTAC GCCTAATTTT 3 3 00 

TGTAATTCTT CAT ATTGG TT TTGTAAGTCT TCTAATTCAG TTGGACAAAC GAATGAGAAG 3360 

TCAGCAGGAT AGAAG CAT AC TACGCTCCAA GAACCTTTTA AATCTTCTTG TGTAACTTCT 34 20 

TTAAATTGAT CTTTTTTTGG ATCGAAArCT TGCGCTGTAA ATGGTAAGAT TTCTTTGTTA 34 80 

ATTAATGACA TAAATATCTT CCTCCTAAGA ATTTAAGTAT GAATTAGAAC TATCAATTGA 3 54 0 

TTGCGCTTAA TTATAATAAT TCTAATCTCT TAGTTAGCAT TATTACATTT TGATCCAGAA 3600 

TAGTCAACTG GATAACTTTG TAAAGTGAAT GATTACTTTT AAAATAAAGA AAGATAATAT 3660 

AAAGTGCTTT GATAATGGAT TTTGTAGTTG ATGATTTAAA AGGTTGTGTC TATATTTAAT 3720 

ATCTTGATTT TAATGTAAAA AATGTAAAAA AAGAAGATTT GT ATT CT C AA CTAAGTCAAC 37 80 

CTTATTGATA ATGGTATGAG AATATTTGTT CGAGATGGAT GAAGGTAATG AGTGAGAAAC 3840 

TGGATTTTTA AAGTATGAGA CAATATTTTA AAAAGTTCAA TTATTAACTT ATAAGCAAAT 3 900 

AATTGCTATA AAAAAGTTTG GACGTGTACA ATTGCAATAT GAAGATTTTA AATTAATTGT 3960 

AAAGTATCGA GGAGTGGGTA ACGTGTCAGA ACATGTATAT AATCTTGTGA AAAAGCATCA 4020 

TTCTGTTAGA AAATTTAAGA ATAAACCTTT AAGTGAAGAC GTTGTTAAGA AATTGGTAGA 4080 

AGCTDGACAA AGCGCTTCGA CGTCAAGTTT CCTGCAAGCA TACTCAATTA TTGGTATCGA 4140 

CGATGAGAAG ATTAAAGAAA ATTTACGAGA AGTTTCTGGA CAACCTTATG TTGTAGAAAA 4200 

TGGCTATTTA TTCGTCTTTG TTATTGATTA TTATCGTCAT CATTTAGTTG ATCAACATGC 4260 

TGAAACTGAT ATGGAAAATG CATATGGTTC AACGGAAGGT TTGCTAGTAG GTG CAATCGA 4320 

TGCAGCATTA GTTGCCGAAA AT ATTG CGGT AACTGCTGAA GATATGGGGT ATGGCATTGT 4 3 80 

CTTTTTAGGA TCATTAAGAA ATGATGTTGA ACGCGTTCGA GAAATTTTAG ACTTACCTGA 444 0 

CTATGTCTTC CCGGTATTTG GTATGGCAGT AGGGGAACCC GCAGATGACG AAAATGGTGC 4500 

AGCCAAGCCA CGCTTACCAT TTGACCATGT CTTCCATCAT AATAAGTATC ATGCTGATAA 4560 

GGAAACACAG TATGCACAAA TGGCAGATTA CGACCAGACA AT CAGCG AGT ACTATGATCA 4620 
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CAAAGCAAGA 


TTAGATATGT TAGAACAATT GCAAAAATCA GGCTTAATAC AGCGATAgCA 4740 

AGATACCAAA ATAACCCGCC CCCCTCTAGC TTAAAATGAT AAGTATAGCT AGAGGGGGCG 4800 

GGTATTTCTT GCAATGAATT AGTGTGAAGT TAATGCAGCA TTATCATTTG AATCGAAAGT 4860 

ATCTTTATCC CAATGTTTAG TTAACTTGGC GGTACCTGTA CCAGCTAGCA TTGAATCGTT 4920 

CACGTTTAAT GCTGTTCTAC CCATGTCAAT CAATGGTTCA ACGGAGATGA GCACGCCGGc 4980 

TAAAGCGACT GGCAAGTTTA ACGTTGACAA CACCAATATG GATGCAAATG TAGCCCCGCC 5040 

ACCGACGCCA GCAACGCCGA ATGAACTAAT AATCACGACA GCGATTAACG TTACAATAAA 5100 

TTGTAAATCA ATTTCTACAT TAGCGACGGG TGCGACCATA ATTGCAAGCA TGGCAGGGTA 5160 

ill GTCCAATCGA CAATCCAAAT GTCGCAGCGA AATTGGCAAT 5220 

A /^ r P*P^ ,, ITS/I f A^tjL-L. I AQjAC GTCTTGTTTG TGTTTGTACA TTCAATGGTA AGGCACCCGC 5280 

GCTTGAGCGT GATGTGAATG CAAAGATTAA TACTTCCAAA GTCTTTTTAA CATAGCGAAT 5340 

TGGGCTAATA CCTAACAGGC TTAAAATAAT TAAGTGAATG ATATACATCG TAATTAATGC 5400 

AGCGTACGAT GCGATTAAGA ATTTTCCTAA AGTCCAAATG GCGCCAAAGT CACTTGTCGA 5460 

TAATGTGTTG GCCATAATTG CTAATACACC GTATGGCGTT AAACGTAAGA CGAACGTCAC 5520 

AATCGCCATT ACTAGTGAAT AGATAGCGTC AATCGCACGC TTAAGCAATT CACCATGATC 5580 

AGGTTGTTTG CGTnTACGCG TAAATAAGCA AATCCTATAA ACGAAGCAAA TATCACGACA 5640 

GCAATCGTGG aAGTTGCACG TTGTCCaGTG AAATCTAAGA ATGGATTTTT AGGCAATAAT 5700 

TCCAAAATTT GTTGTGGTAA CGTATGTGCT GTTAAATCTT TCGCTTGTTT AGCAATTTCG 5760 

CTTCCACGTG CTTGTTCAGC GTTACCAAGG TTAATTGTTG ATGCATCTAA ACCAAACACC 5820 

AAGGCATACA CAACACCAAC AATCGCAGCA ATGGTGACAG TGCCAATTAA AAAGATAAAA 5880 

ATGAGACTAC CAATTTTAGC AAACTTTTCT CCGATTTGAA TTTTAGTGAA TGCAGCTACA 5940 

ATAGAAATGA AAATTAAAGG CATAACAATC ATTTGCAACA ATGCAACGTA ACCTTGTCCG 6000 

ACAATGTTGA ACCAGTCACT TGTTGATGTA ATAACATTCG AATGTGTGCC ATAAATAAGA 6060 

TGCAATAACA CACCGAATAC TATACCAATC CCTAAAGCTG TAAACACACG TTTCGCAAAA 6120 

GATATATGTT TGCGAGCCAT CATGTGCAAT ATTACGATGA AAATCACCAA TACAATAATA 6180 

TTAATCAGTG TAAGAAAAGC ATTCATGAAC GTCACTCCTT AAATTTTTGA ATATAATTCC 6240 

GACTAGTATG CT 6252 



(2) INFORMATION FOR SEQ ID NO: 51: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6730 base pairs 
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(C) STRAND EDN ESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
ATCAAATCnC AAAATATTTA TTAATnAnAA GGGGATTATC CaTGTgAGAA ACAAAGTAAT 60 

GCTCTTTTTT TACCTCTTGT GGGTTGAAAA aTGGATCATC AGAGATAGAC TTCTTCTTTT 120 

TCGAAGATGA CATTTGATAC TTTAATCTTC TAAAACCATA ACTTGTCGCA TCAAAAATGC 18 0 

CTTCTTGTAC AAGTAAAATC AAAAATATGC TAATAAAAAT AATTAATGAA ACATAAAACA 24 0 

ATATATTTAA ATATGTAATG ATAGTATGGC TATTAAAAAG CCATATAATA AACGTTAATA 3 00 

TTGG CGTT AT TAGTGCCATT CCAAGCCATT TTTTCAACAT TTGATCACTC CCACTTATAG 360 

AAAACTCTTA CGCATAGTTT ACATTAAAAT CAGACATTGA GG AATGATTT TTTAATTTCT 420 

TCAGCTTTAT TGAAATTCTA AAATCAATCA TTCTTCATTA GTTTAAAGCA AAAAAATATT 4 80 

GATATATAGT AAATATTGTA TATATAATAT TAGTTAAGAT TTCaGAAAAT TTTGAAGGGA 54 0 

ATGGAAATTT AGAAATCGGA ATTTGTTAGA GGAGGGGATT AGATGGGGAA ATATATTTTC 600 

AAACGATTTA TTTATATGCT TATTTCTTTA TTTATTATTA TTACAATTAC ATTTTTCTTA 660 

ATGAAATTAA TGCCAGGTTC GCCATTTAAC GATGCTAAAT TAAATGCTGA ACAAAAAGAA 720 

ATTTTAAATG AAAAATATGG ATTAAATGAT CCTG t AG CTA CGCAgTATTT ACATTATTTA 780 

AAAAATGTTG TTACAGGCGA TTTTGGTAAT TCATTCCAGT ATCATAATCi\ ACCTGTGTGG 84 0 

GATTTGATTA AACCGAGACT ACTACCTTCT TTTGAAATGG GTCTTACAGC AATGTTCaTC 900 

GGTGTGATAC TGGGACTTAT TTTAGGTGTT GCAGCAGCTA CTAAACAAAA TTCTTGGGTT 960 

GACTATACAA CTACAGTTAT TTCAGTTATT GCAGTATCTG TACCATCTTT TGTACTTGCT 1020 

GTACfTTTAC AATATGTATT TGCAGTTAAA TTAAGATGGT TCCCAGTAGC TGGATGGGAA 1080 

GGTTTTTCGA CCGCGGTATT ACCGTCACTT GCATTATCTG CAGCTGTTTT AGCAACTGTC 1140 

GCCAGATACA TAAGAGCAGA GATGATAGAG GTATTAAGTT CAGACTATAT TTTATTAGCG 1200 

AGAGCTAAAG GTAATTCGAC AATG CGTGT A CTTTTTGGAC ATG CACTTAG AAATGCTTTA 1260 

ATTCCAATTA TTACAATTAT CGTTCCCATG TT AG CAAGT A TTTTAACAGG CACTTTAACA 1320 

ATTGAAAATA TTTTTGGAGT TCCTGGATTA GGGGATCAAT TCGTACGTTC AATTACAACA 13 80 

AATGATTT CT CAGTAATCAT GGCAATCACA CTATTATTTA GCACACTGTT TATCGTTTCT 1440 

ATTTTTATTG TAGATATTTT GTACGGTGTG ATAGATCCAC GAATTCGTGT TCcAAGgAGG 1500 

TAAAAAATAA TGGCTGAAAA TAAAAACAAT TTGTCGATTA ACGACGATCA TTCTAATGCA 1560 
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TGAATCAGGA ACCTGAAATG CAACGAGAAA GCAAAAACTT TTGGCAAGAT GCTTGGGCTC 1680 

AGTTAAAACG AAATAAGTTA GCTGTTGTCG GTATGATAGG TTTAATTATC ATTGTAATAT 174 0 

TTGCTTTTAT CGGTCCAGTT ATAAATAAAC ATGATTATGC TGAACAAAAT GTAGAACATA 1800 

GAAATCTTCC GGCAAAAATA CCTGTATTAG ACAAAGTTCC ATTTTTACCT TTTGATGGTA 18 60 

AAGATGCAGA TGGCAAGGAT GCTTATAAAG CAGCAAATGC TAAAGAAAAT TATTGGTTTG 1920 

GTACTGATCA GTTGGGTCGA GATTTATGGA CAAGAACATG GAAAGGTGCT C AAATTT CAT 1980 

TGTTTATCGG TGTTGTTGCA GCGATGTTAG ATATTTTTAT TGGTGTTGTA TATGGTG CG A 2 04 0 

TTTCTGGATT CTTCGGTGGA CGTGTCGATA CGATTATGCA ACGTATACTT GAAGTCATAG 2100 

CATCTATTCC GAATTTAATT GTCGTAATTT TATTTGTATT AATTTTTGAA CCATCCATTT 2160 

GGACAATTAT ATTGGCTATG TCTATCACAG GCTGGTTAGG CATGAGCAGA GTTGTACGTG 2220 

GAGAATTTTT AAAATTAAAA AATCAAGAGT TTGTCATGGC TTCGAAAACA TTGGGGGCTT 2280 

CAAAATTCAA ATTGATATTT AAGCATATTT TACCTAATAC ATTAGGTGCT ATCGTGGTTA 2 34 0 

CATCAATGTT TACAGTACCT AGTGCTATTT TCTTCGAAGC ATTTTTAAGT TTCATTGGTA 24 00 

TAGGTGTACC CGCACCTCAA ACATCGTTAG GGTCATTAGT AAATGATGGG CGCGCAATGT 24 60 

TATTAATTTA TCCACATGAA TTATTTATAC CAGCAATGAT TTTAAGTTTA TTAATTCTAT 2 520 

TCTTTTACTT ATTTAGTGAT GGATTACGTG ATGCATTTGA TCCGAAAATG CGTAAATAAA 2580 

AAGGGGGCAT AGCATATGAC TGAAAGAATA TTAGAAGTAA ATGATTTGCA TGTTTCCTTT 2 64 0 

GATATTACAG CAGGGGAAGT GCAGGCAGTG AGAGGCGTAG ATTTTTATTT GAACAAAGGG 2700 

35 GAAACATTGG~CAATTGTTGG~TGAATCAGGT~T 2760" 

ACAAAATTAT TCCAAGGGGA CACAGGAAGA ATTAAAAAGG GAGAAATTTT ATTTTTAGGG 2820 

GAAGATTTAG CAAAAAAACC TGAAAATGAG TTGATTAAAT TACGTGGCAA AGATATTTCA 2880 

ATGATCTTTC AAGATCCAAT GACATCTTTA AACCCAACGA TGCAAATTGG TAAACAAGTC 2940 

ATGGAACCAT TAATTAAGCA CAAAAATTAT AGTAAAGCAC AAGCTAAAAA GCGCGCATTG 3000 

GAAATACTAA ATCTTGTAGG TTTACCAAAT GCAGAAAAAA GATTTAAAGC ATATCCTCAT 3060 

CAATTTTCAG GTGGACAAAG GCAAAGAATT GTTATTGCAA CCGCATTAGC TTGTGAACCT 3120 

AAAGTGCTCA TTGCTGATGA ACCAACGACT GCATTAGACG TAACGATGCA GGCACAAATT 3180 

TTAGATTTAA TGAAAGAACT ACAACAAAAA ATCGATACAG CAATTATTTT TATAACGCAT 3240 

GATTTAGGGG TTGTTGCGAA TATTGCTGAT AGAGTGGCAG TTATGTATGG TGGTCAAATG 3300 

GTTGAAACAG GAGATGTTAA CGAAATATTT TATGATCCAA AGCATCCATA TACATGGGGA 3360 
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GG AGCGCCAC CTGATTTATT ACACCCACCT AAAGGTGATG CATTTGCGAG ACGTAGcAAT 34 BO 

ATGCATTAGA TATTGATTTT AAAGTAGAAC CACCGTGGTT TAAAGTTTCA CCGACACATT 354 0 

TTGTGAAATC TTGGTTATTA GACGCACGTG CACCAAAAGT TGAACTACCC GAGCTGGTAA 3600 

AACAACGTAT GAAACCGATG CCTAATAATT ATGAAAAACC ACTCAAGGTA GAAAGGGTGT 3660 

CGTTCAATGA AAAATGATGA AGTGCTATTA TCTATTAAAA ATTTAAAGCA ATATTTTAAC 3720 

GCAGGAAAGA AAAACGAAGT GgaGCGATTG AAAATATTTC GTTTGATATA TACAAAGGGG 3780 

AAACATTAGG TTTAGTAGGA GAATCGGGGT GTGGTAAATC TACAACTGGT AAATCAATTA 3 84 0 

TTAAACTTAA TGATATTACA AGTGGAGAAA TTTTGTATGA GGGTATTGAT ATACAAAAGA 3900 

TTCGTAAACG TAAAGATTTG CTTAAATTTA ATAAAAAGAT ACAGATGATT TTTCAAGACC 3 960 

CATATGCGTC TTTAAATCCT AGGTTAAAAG TAATGGATAT AGTAGCTGAA GGTATTGATA 4020 

TCCATCATTT AGCAACTGaT AAGCGTGACC GAAAAAAACG TGTCTATGaT TTACTTGaAA 4 080 

CTGTTGGATT AAGTAAAGAA CATGCCAATC GCTATCCTCA TGAATTTTCA GGTGGaCAAC 414 0 

GCCAACGTAT TGGaATTGCC CGTGCATTAG CCGTTGaACC AGAATTCATT ATCGCGGACG 4 200 

AACCAATATC GGCATTGGAT GTTTCAATCC AAGCTCAAGT AGTTAATTTA TTATTAAAAT 4 260 

TACAACGTGA AAGAGGGATT ACGTTCCTAT TTATAGCTCA TGATCTATCA ATGGTGAAGT 4320 

ATATTTCAGA TCGTATTGCA GTCATGCATT TTGGGAAAAT AGTTGAAATT GGACCGGCAG 4380 

AAGAAATTTA TCAAAATCCA TTACACGATT ATACTAAGTC TTTATTATCA GCCATTCCAC 4440 

AACCTGATCC TGAATCAGAA CGCAGTCGCA AACGATTTAG TTATATTGAT GATGAAGCAA 4500 

ATAATCATTT AAGACAATTA CATGAAATTA GACCGAATCA CTTTGTCTTT AGTACTGAAG 4560 

AAGAAGCGGC ACAACTACGA GAAAATAAAT TGGTGACACA AAATTAAGGG GAAGGGGGAA 4620 

ATG<!AATGAC GAGAAAATTT AGAACACTTA TTTTAATTTT GATTGCTACA ATTGCATTAA 4680 

GTGGTTGTGC TAATGACGAT GGTATTTATT CAGATAAAGG TCAAGTATTC AGAAAAATTT 4740 

TGTCATCAGA CTTAACATCC CTTGATACAT CATTAATAAC GGATGAAATA TCTTCTGAAG 4 800 

TGACTGCGCA AACATTCGAA GGTTTATACA CATTAGGAAA AGGTGACAAA CCGGTGTTAG 4 860 

GTGTTGCGAA AGCTTTTCCT GAAAAGAGTA AAGATGGTAA AACTTTAAAG GTTAAATTAA 4920 

GAAGCGATGC TAAATGGAGC AATGGTGACA AAGTGACTGC ACAAGACTTT GTTTATGCTT 4 980 

GGAGAAAAAC AGTTGACCCT AAAACAGGTT CTGAATTTGC ATACATTATG GGGGACATTA 5040 

AAAATGCGAG TGATATTAGT ACTGGTAAGA AACCTGTAGA GCAATTAGGT ATCAAAGCAT 5100 

TAAATGATGA AACATTACAA ATTGAATTAG AAAAGCCGGT TCCATATATT AATCAATTAT 5160 
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ACGGTACGGC AGCTGATAGA GCGGTATACA ATGGTCCaTT TAAAGTTGAT GATTGGAAAC 52 80 

AAGAAGATAA AACCTTACTA TCTAAAAATC AGTATTATTG GGATAAAAAG AATGTAAAAT 5340 

TAGATAAAGT GAATTATAAA GTTATTAAAG ACTTACAAGC CGGTG CATCA TTGTATGATA 54 00 

CTGAATCAGT AGATGACGCA TTTATTACTG CAGATCAAGT AAATAAATAT AAAGACAACA 54 60 

AAGGATTAAA CTTTGTGTTA ACGACTGGGA CATTTTTTGT AAAAATGAAT GAAAAACAAT 5520 

ATCCTGATTT TAAAAACAAA AATTTAAGAT TGsTATCGCA CAAGCAATAG ATAAAAAAGG 55 80 

ATACGTTGAT TCAGTGAAAA ACAATGGCTC AATTCCTTCC GATACACTAA CAGCCAAAGG 564 0 

AATTGCGAAA GCGCCTAATG GCAAAGATTA TGCGAGTACC ATGAATTCGC CTTTAAAATA 5700 

TAATCCTAAA GAAGCAAGAG CACACTGGGA CAAAGCTAAA AAAGAGTTAG GTAAAAATGA 5760 

AGTGACATTT TCAATGAACA CAGAAGATAC ACCAGATGCA AAAATATCTG CTGAATATAT 5820 

CAAATCGCAA GTTGAGAAAA ATTTACCAGG AGTTACTTTG AAAATTAAGC AATTACCGTT 58 80 

TAAACAAAGA GTATCACTAG AACTGAGTAA CAATTTTGAA GCATCACTTA GTGGTTGGTC 594 0 

TGCAGATTAC CCTGATC CTA TGGCTTATTT AGAAACAATG ACCACAGGTA GCGCACAAAA 6000 

TAATACAGAC TGGGGTAATA AAGAATATGA TCAATTACTT AAAGTAGCAA GAACCAAATT 6060 

GGCACTTCAA CCGAACGAAC GATATGAAAA CTTGAAAAAA G CAG AAG AAA TGTTCCTAGG 6120 

AGATGCACCG GTAGCACCAA TTTATCAAAA AGGTGTtGCA CATTTaACAA aTCCTCAAGT 6180 

AAAAGGATTA ATT t AC CAT A AATTTGGTCC AAATAACTCA CTTAAACATG TATATATTGA 624 0 

TAAATCGATA GATAAAGAAA CAGGTAAGAA GAAAAAATAA TATGCTTTGT AAATTAGGCT 63 00 

~GGAGACATAT"CTCCAGTCTT-TTTGTGTTGG- 6360" 

AAGTCGTTTT TTAAATTACT GAAATTGATT AAATGCATAA ATAACTGAAT ATTCTAAAAA 6420 

TAA/CtTTGTA ATAATTTTTT CTATGAGTAA ACTAAAAAGA AAAAATTAGA TTGAAAGTAG 64 80 

GAGGCATATG TATGGGGAAG CTAATTAAAT ATATTTCAAT ACTTCTTATT GTCGTTTTAG 6540 

TGTTGAGTGC TTGCGGAAAA AG CAGTAATA AAGATGAAGG AGTAAAAGAT GCTACTAAAA 6600 

CGGAAACCTC AAAACATAAA GGTGGTACCT TAAATGTAGC ATTAACAGCA CCGCCAAGTG 6660 

GTGTTTATTC TTCGTTATTA AATAGTACAC ATGCAGATTC TGTAGTTGAG GGATATTTTA 6720 

ACGAAAGCTT 6730 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6482 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
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(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 52: 

AATTTTTGTC ATTATTAAAA ACCTCGCTTT TAAAAGATTG AAAAGTAAAT GAGTGAAATT 60 

AAAGATTATG CACATTAAAA TCACGCCACA ATTTAATTGT GAAAAATATC ACAAATATAT 120 

TATAACACTA AATTTCCCAA AATTCAAAAG TGTGTTTTAT TGCAGAAAAC TTATAACAyG 180 

TGCACAAGTT ATAGTGAATT GCAAACGGAT TACTTTAGTC TTTTTAAAAC ATGAAGTATA 240 

ATTTGTATAG CAATAAATAT AAAAATGGGA GGCTATGTTC AATGAGCAAT ATGAATCAAA 3 00 

CAATTATGGA TGCATTTCAT TTCAGACATG CGACTAAGCA ATTCGATCCA CAAAAGAAAG 3 60 

TTTCGAAAGA AGATTTTGAA ACAATATTAG AGTCAGGTAG ATTGTCTCCA AGTTCTCTTG 420 

GGTTAGAACC TTGGAAGTTT GTCGTGATTC AAGATCAAGC GTTACGTGAT GAATTAAAAG 4 80 

CGCACAGTTG GGGCGCAGCA AAACAATTAG ATA GAG C GAG CCATTTTGTG CTAATTTTTG 54 0 

CGCGTAAAAA TGTAACGTCA AG AT CACCGT ATGTACAACA TATGTTAAGA GATATTAAAA 6 00 

AATATGAGGC ACAAACGATT CCAGCTGTTG AACAAAAATT CGATG CATTC CAAGCAGATT 6 60 

TCCATATTTC TGATAATGAT CAAGCCTTGT ATGACTGGTC AAGTAAACAA ACGTATATTG 720 

CATTAGGCAA TATGATGACG ACAGC CGCAT TGTTAGGTAT TGATTCATGT CCGATGGAAG 780 

GTTTTAGTCT GGATACAGTG ACAGACATTT TAG CAAAT AA AGGGATCTTA GATACTGAGC 840 

AATTTGGTTT ATCAGTGATG GTCGCATTTG GCTACAGACA ACAAGAGCCA CCGAAAAATA 900 

AAACACGCCA AGCTTATGAA GATGTTATTG AATGGGTTGG ACCAAAAGAA TAAATAGAAT 960 

ACCGTATGTC TAAATATATA AAATTAAAAA GTTAGCAATA AAAAAGCCTG CGATTACATA 1020 

AATGAATCGC AGGcTTTTGC GTGAAAAAAT TGTATTAATA AAGTATGGAT GATTATTTTT 1080 

CTGGfiACAAG GTCAGTATTT GAATGAACTG TGATGTCAAA CCCTTCTGGT GCCGTAAATG 1140 

TATGTGTTGA GGCGTCGGGT TGATAAATAT CAACATGTGT TAATC CATAA CTTTGTGAAT 1200 

TGTTTTGTCT TGCTTGATTG GATTGCCAAG TATTAGCAGC AATATGATGG TGATAATGAT 1260 

TCGTTGACAT AAATAGCGCA CGTGGAAAAT CAGACACATG TTGGAATCCT AATTGTTCAA 1320 

TGTAACATTG ATATGCTGCG TCTAAATCAT GTGTTTTTAA ATGTAAGTGT CCAATCATGC 1380 

CTTTTGCTGG CATTCCTTGC CAACCTTCAT CAGTACGATG TGTTAATAAG GTTTGGCTAT 144 0 

CAACTTCTAA AGTATCCATT TTAACTTTGC CATTTTGCCA TTCCCATGAA GATGAAGGTC 1500 

TATCGCGATA GACTTCAATA CCATTACCTT CGGGGTCGTT GAAATATAAA GCTTCACTTA 1560 

CTAAATGATC ACCAGCGCCG ATGCCCATAT TTTTTTGTGC CACGAAATAT AAGAAGTTAG 1620 
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aAGTCTGACG 


GcCGTCTTCT AATAAATGTA ACGTTAGAGT ATGGcCACCA GTCCCAACAG 1740 

ATAATACGGT TGTATTATCG TCAGAACTTT TAACGGATAG TCCTAAAATG TTTTTGTAAA 1800 

ATGTTGTCAT TAAGTCTAAG TCTCTTACGT TCAGTACAAT GTTTGTCACT TGTGTTGCTG 1860 

TTTTATCGTG AAATGCCATT ATGCATCGCC TCTTTTTCTA TTTTTCTATA AGTTAGTATA 1920 

AAAAGTATAC CAGAAAAGAA AATGAATTGA TAGCATAAAG TTTGAAATGC AAAATAACTA 1980 

GTCGTTTTGC AATTTTACAT TGATGCGAAC AAAAAAGCGA TGGTACAGTT GCACCATCGC 2040 

AAAATTTATT TAACCAAGAT ATACATCTTG ATATGAATCT TCTTTTTCTA ACATATGTTT 2100 

GGCAAATGAA CATGAGGCAA TAATTTTCAA ATTATTTTCT CGAGCGTGTT CAACAACTGc 2160 

TTTAAGTAGT TTTTTGCCAA CACCTTGACC ACCAAGTTCA TCAGATACGC CTGTATGATC 2220 

AATGTTAATT TCATTATTAT CCACAAAACG GTATGTGATT TCAGCTAAAG CATTATTTTC 2280 

ATCATCACCA ATATAGAATT TGTTCTCGCC TTGTTTGATT TCAAGGTTAC TCATACATAT 2340 

CAACTCCTAT CATGATTGAT TATAGTATTT CCCTATTCTA TTTTAACTTA AACGAAGTCA 2400 

AAGGTGCATG ACAGTCATGT GACGACATTG CCACATCTAT GTAGTCGTTT TTATTAAGCA 2460 

CAGTTTGAAA TGAAGATGAA AACACGTATC TTGACATTAA ATCTATTCAG CTATATAATT 2520 

TATCTCGAAA TCGAAATAAA ATAAAAAAGT TGGTGATCAT ATGGATCGAA CGAAACAATC 2580 

TCTCAATGTT TTTGTCGGAA TGAATAGGGC GTTAGACACA TTAGAGCAAA TTACAAAAGA 2640 

AGACGTAAAG CGATATGGCT TAAATATTAC TGAATTTGCA GTGCTCGAGT TGCTTTATAA 2700 

TAAAGGTCCG CAACCAATTC AACGTATTAG AGACCGCGTA TTAATTGCAA GTAGCAGCAT 2760 

TTCATATGTT GTAAGTCAAT TAGAGGACAA AGGTTGGATT ACACGTGAAA AGGATAAAGA 2820 

TGATAAACGT GTATATATGG CTTGTTTAAC TGAAAAAGGT CAAAGTCAAA TGGCAGATAT 2880 

TTTCCCTAAG CATGCTGAGA CATTAACAAA AGCGTTTGAT GTGTTAACAA AGGATGAATT 2940 

AACAATCTTA CAACAAGCGT TTAAGAAACT AAGTGCACAA TCTACAGAAG TGTAAGGCGT 3000 

GCACTAAAAA TTTACATTAA AGTATCTCGA TTTCGAGATA AATGCACTAA AAATATAAAG 3060 

AGGGTATATA AAATGATAAA TAATCATGAA TTACTAGGTA TTCACCATGT TACTGCAATG 3120 

ACAGATGATG CAGAACGTAA TTATAAATTT TTTACAGAAG TACTAGGCAT GCGTTTAGTT 3180 

AAAAAGACAG TCAATCAAGA TGATATTTAT ACGTATCATA CTTTTTTTGC AGATGATGTA 3240 

GGTTCGGCAG GTACAGACAT GACGTTCTTT GATTTTCCAA ATATTACAAA AGGGCAGGCA 3300 

GGAACAAATT CCATTACAAG ACCGTCTTTT AGAGTGCCTA ACGATGACGC ATTAACATAT 3360 

TATGAACAGC GCTTTGATGA GTTTGGTGTT AAACACGAAG GTATTCAAGA ATTATTTGGT 3420 
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TTAAATGAAG GGGTAGCACC TGGTGTACCT TGGAAGAATG GACCGGTTCC AGTAGATAAA 354 0 

GCGATTTATG GATTAGGCCC CATTGAAATT AAAGTAAGTT ATTTTGACGA CTTTAAAAAT 3600 

ATTTTAGAGA CTGTTTACGG TATGACAACT ATTGCGGATG AAGATAATGT CGCATTACTT 3660 

GAAGTTGGCG AAGGAGGCAA TGGTGGCCAG GTAATCTTAA TAAAAGATGA TAAAGGGCCa 3720 

GCaGCACGTC AAGGTTATGG tGAGGTACAT CATGTGTCAT TTCGTGTGAA AGATCATGAT 3780

GCAATAGAAG CGTGGGCAAC GAAATATAAA GAGGTAGGTA TTAATAACTC AGGCATCGTT 3840

AATCGTTTCT ATTTTGAAGC ATTATATGCA CGTGTGGGGC ATATTTTAAT AGAAATTTCA 3900

ACAGATGGAC CAGGATTTAT GGAAGATGAA CCTTATGAAA CATTAGGCGA AGGGTTATCC 3960

TTACCACCAT TTTTAGAAAA TAAAAGAGAA TATATTGAAT CGGAAGTTAG ACCTTTTAAT 4020

ACGAAGCGTC AACATGGTTA ATTGGAATGA GGAGGATTTG TGATGGAACA TATTTTTAGA 4080 

GAAGGACAAA ATGGTGCGCC AACACTAATA TTATTGCATG GTACAGGTGG TGATGAGTTC 4140 

GATTTATTAC CGTTAGGCGA AgcATTGAAT GAAAATTATC ACTTGTTAAG TATTAGAGGA 4200

CAAGTTTCAG AAAATGGGAT GAACCGTTAT TTCAAACGTC TTGGTGAAGG TGTTTATGAT 4260

GAAGAAGATT TGGCATTTCG TGGACAAGAA TTGTTGACGT TCATTAAAGA AGCTGCTGaA 4320

CGTTATGATT TTGaTATTGA AAAAGCAGTA CTTGTTGGAT TTTCAAATGG ATCAAATATA 4380

GCGATTAACT TAATGTTGCG TTCAGAAGCA CCATTTAAAA AAGCATTGTT ATATGCACCG 4440

TTATACCCAG TTGAAGTAAC GTCAACAAAG GATTTATCAG ATGTCAGTGT GTTGCTTTCT 4500 

ATGGGGAAAC ATGATCCAAT TGTGCCATTA GCTGCAAGTG AACAAGTCAT TAACTTGTTT 4560

AATACACGTG GGGCACAAGT CGAAGAAGTT TGGGTGAAGG GCCATGAAAT TACAGAAACT 4620 

GGATTAACGG CTGGTCAACA AATACTTGGG AAATAACAGT TCTATTAAGA AGCGGACAGA 4680 

TGGAAAAGAT TTTTACTTTT CATCTGCCCG CTTTTTTGAT TTTGAAGTGC TGTACTAAAT 4740 

TTTACAATAG TATAGATATT TTAATCGATA TGAGATTTGC CGGTAATACG CTTAATTAAA 4800 

CCTTTATAGA GTACAGGTAT GAGTAAGATG AAACCGAACA ATCCCATAAT AGGGAATACT 4860 

TTTCCAATTA ATGAAATGAa ACCGATAAAT GTACTAATAT AAGTGATGAC AGCCATTGTA 4920 

ATAATAATGA TGAAGTAACG TCTGCTGAAT GGAACGCTGA AACGTGACGC AAATGCATAC 4980

ATTAATCCAA CAACAGTATT GTAGATGACA AGTATCATAA TGACAGACAT AATAATACCA 5040 

ATTGACGGAG ACATTTGTGT CGCTAATTTT AATGTAGGTA GATCTACGTG TTTAATTTTA 5100 

TCGAATTGAG AAATTAAACC TAGATTAATC ATCATGAGTA AAAATGTAAT GATTAAACCG 5160 

CCAATCAAGC CCCCGTATAA CGTTGAGTCA CGATATTTAA CTTTACTACC CATCACTGAT 5220 
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CCAGGTGATA ATGATTTCTG CTTATGAATC TGAGCATCAT TATTAGCGGC AGTAAAATCA 5340 

AGATGACTTG TTGTGAAATA GTAGACCGCA ATCATAATGA CAATCGCAAT TAAAAATGGG 5400 

GTAACACCGC CAAGCACAGC AATTAAACGA TCGAATTTTA GAAACAGTGT TGCTAAAATA 5460

AAGGCGACTA ATATGAGTGC GCTCAGCCAA TACGGTAAGT TGAAACTTTG ATGAATGGTT 5520 

GACGCACCAC CTGCAGTCAT AATAATAGCT AAAGACAACA TAAACATTGT TAAAATAATA 5580 

TCAAAACCTC TTGCAATAGA GGGGTATAAG AAATAGTTAA TTGAATCAGA ATGATTTCTG 564 0 

GACTTTAGAT GATGACCTGT ATGCATGACA ACCATTCCAC CTAAAGTAAT CAATAGTCCT 5700 

GTTACAATAA TGCCTGAAAT GCTATATGCG CCATGACTTG TGAAAAACTG GAAAATTTCT 5760 

TGACCAGTAG CAAAGCCGGC ACCAACGACA ACACCAACAA AGGCAAATGC CACAATAATG 5820 

GACTCTTTTA AGATACGCAT GATTTAAAAA TGTCCCTTCG TAATTTTAAG TAATATAGAA 5880 

AATGTAACAT ACATGTTAAT GAAAAATATA GTACTAATAT AGTATTTTGT TAAATTGGAG 5940 

TAGAAGCGAG GGTGTCGGTC ATTTCATTAA TTTATTAGTT GATTTTGCAT TTTTTTGCTG 6000 

TAAAGTTGTT ATAATACAGT TAACAGGAAT TAGCATAGAT ACACCAATCC CCTCACTACT 6060 

CGCAATAGTG AGGGGATTTT TTTCGGTGTA GCTAGGTCGC CTATTTATCA TCGTGTTTGC 6120 

GTAgCaATGC GTAAACACAG TACCACTAAA TAAGTGCACG ATACATGCAT CAAATGTCGT 6180 

CTTTAGTCTA AGTAACGATC ATGCATTAAC ATTTTCAAAA TATCTATTTG AGCTTGAAGA 624 0 

TCTTTACCAA TATTGGTATC ACGAATCTTC TTACGTTGTA ATTCTTTATC TACGACGCGC 6300 

TTTATAGAAA GTTCATCGAT ACCTTCGGAA AGTATTTTTn CTTTAGCGTT AAATTGTTGG 6360 

TGTGCAACGA GTTGCATACC GAATGAATTA TACAATAGTG TATAGCCTGC AATGCCAGTn 6420

GTTGACTGAT AAGCTTTTGA AAAGCCACCA TCAATGACAA GCATCTTTCC ATCAGCCTTG 6480 

AT 6482

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16592 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

ATTTAAGGCG ATTGCTTGTG TATTTCTCTC TTTTGTAGGC AAACCTGCAC TCGTTCCAAA 60
AAATGTAACT TCCATATATG CCCCTCCTTT TCTTCAATTC ATTTTATCAT AAAATTTGTA 120 
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AATTTTTCTA ACTTTAACGT AGACATAACT ATATAAATTT TGATAATTAC GTTATACTTA 24 0 

TCATTAATAA GTATCACATT AAACATGATA CATGAATCGA TATTTCATTT AAGACACTGC 300

ATACAGTCGA GCATATTGTA TGACCTACTG AATGGATTAT CTTATAATAA TAAATCATAT 360

ATCTAATTAA GAATTGAGGT TTTAATCTTG AGTACTAAAA ACAAACACAT CCCATGTTTA 420

ATCAGAATCT TTGGTGCACT GCGTGACTTA AGCCATCGTA AGTnGTTTCC ATCAATATTC 480

CATCTCTACC AACAAGACAA TTTAGATGAA CATATTGCCA TcATCgGTAT TGGACGTCGT 540

GACATkwnTA ATGATGATTT CCGTAATCAA GTAAAATCAT CAATTCAAAA GCACGTAAAA 600

GATACAAACA AAATTGACGC GTTTATGGAA CATGTCTTCT ATCATAGACA TGATGTTAGT 660

AATGAAGAAA GCTATCAAGA ATTACTAGAT TTTAGTAATG AATTAGATAG CCAATTTGAA 720

TTAAAAGGTA ATCGACTATT CTATTTAGCA ATGGCACCAC AATTCTTTGG CGTTATTTCT 780

GATTATCTAA AATCTTCTGG TCTTACTGAT ACAAAAGGAT TTAAACGCCT TGTTATCGAA 840

AAACCATTCG GTAGTGATTT AAAATCAGCC GAAGCATTAA ACAATCAAAT TCGTAAATCA 900

TTTAAAGAAG AAGAAATTTA TCGTATTGAC CACTATTTAG GAAAAGACAT GGTTCAAAAT 960

ATCGAGGTAT TACGTTTTGC GAATGCGATG TTTGAACCAT TATGGAATAA CAAATATATT 1020

TCAAACATCC AAGTTACATC TTCTGAAATA CTAGGTGTTG AAGATCGTGG TGGTTATTAT 1080

GAATCAAGTG GCGCGCTAAA AGATATGGTG CAAAACCACA TGTTACAAAT GGTTGCATTA 1140

TTAGCTATGG AAGCACCTAT TAGTTTAAAT AGTGAAGATA TCCGTGCTGA GAAAGTAAAA 1200

GTACTTAAAT CACTGCGTCA TTTCCAATCT GAAGATGTTA AAAAGAACTT TGTTCGTGGT 1260

CAATATGGCG AAGGCTATAT CGATGGTAAA CAAGTTAAAG CATACCGTGA TGAAGATCGC 1320

GTTGCAGATG ACTCTAACAC ACCTACCTTT GTTTCAGGTA AATTAACAAT TGATAACTTT 1380

AGATGGGCTG GTGTACCATT CTATATTCGT ACTGGTAAAC GTATGAAATC TAAAACAATT 1440

CAAGTTGTCG TTGAATTTAA AGAAGTACCA ATGAACTTAT ACTATGgAAA CTGaTAAACT 1500

GTTAGATTCA AACCTATTAG TAATCAATAT CCAACCTAAT GAAGGTGgTA TCTTTtACAT 1560

CtAAATGcTA AGaAAAATAC ACAAGGTATC gAAACAGrAC CTGtCCmATT GtCTTACTCm 1620

ATGaGCGcTC aAGaTAAAAT GaATACTGTA GATGCATATG AAAATCTATT ATTTGATTGT 1680

CTTAAAGGTG ATGCCACTAA CTTCACGCAC TGGGAAGAAT TAAaATCAAC ATGGAAATTT 1740

GTTGATGCAA TTCAAGATGA ATGGAATATG GTTGaTCCAG AATTCCCTAA CTATGAATCA 1800

GGTACTAATG GTCCATTAGA AAGTGATTTA CTACTTGCTC GTGATGGTAA CCATTGGTGG 1860

GGACGATATT CAATAATTGA ATTAAAACGC ACATGTTAAA CAAAAATAAA TGAGCGAATG 1920




EP 0 786 519 A2 






TATATTATGA AATTATATTT TACAATGCCC AAAACTATTT TAATAATCAT TGAACAAATG 2040

GGTGTATAAT TTATAGAAAT AATGTAGAAT AAAAATAAAT GATTGAATTA ATTGGAGTGA 2100 

AAGTTTTGGA CGTTATCAAG CAAATACAAC AGGCAATTGT TTATATTGAA GATCGTTTAT 2160 

TAGAGCCTTT CAATTTGCAA GAATTAAGTG ATTACGTTGG TCTTTCGCCA TACCATCTTG 2220 

ATCAATCATT TAAAATGATT GTCGGCTTAT CTCCAGAAGC TTATGCACGC GCGCGTAAAA 2280

TGACACTCGC TGCAAATGAT GTGATTAATG GTGCTACACG ACTTGTAGAT ATCGCTAAAA 2340

AATATCACTA TGCAAATTCA AATGATTTTG CAAATGATTT TAGTGATTTT CACGGCGTAT 2400

CACCTATTCA AGCCTCTACT AAAAAAGATG AATTACAAAT TCAAGAGCGA TTATATATCA 2460

AATTATCAAC TACTGAGAGA GCACCTTATC CATACAGATT AGAAGAGACA GATGATATTT 2520 

CATTGGTTGG ATATGCACGA TTTATAGACA CTAAGTATTT GTCACATCCT TTTAATGTTC 2580 

CGGATTTTTT AGAAGACTTG CTCATTGATG GTAAAATTAA AGAGTTACGA CGATATAATG 2640

ACGTTAGTCC ATTTGAACTA TTTGTTATTA GTTGTCCTCT TGAAAATGGT TTAGAAATAT 2700

TTGTAGGTGT ACCAAGTGAA CGTTATCCTG CACACTTAGA AAGTCGATTT TTACCTGGCA 2760

AACATTGTGC GAAATTCAAT TTACAAGGTG AAATTGATTA TGCAACTAAT GAAGCTTGGT 2820

ACTATATTGA ATCAAGTTTG CAGTTAACAT TGCCATATGA ACGAAATGAT TTATATGTTG 2880

AAGTGTACCC TCTCGATATT TCATTTAATG ACCCATTCAC TAAAATTCAG CTTTGGATTC 2940

CTGTTAAACA GAGTCCTTAT GACGAAGATT AAATAATAAA AAACAAAGAA GCCCCCTAAT 3000

ATATCTATAG GTCTACAAAT GGCCTTAGAT TCTATTAGGG GGCATATTAA TATGTTAATT 3060

TAGTTCGATA ACACATGCTT CATATGGACG TAACTGTTTT AAATTAACTT TGGCATCATA 3120 

ATTAAATAGC TTTACTTCTC CATGGCTTAA ATCAAATGGT ACAGTTAATT CTGCTTCGTG 3180

GTTAGTAAGA TTACCTACAA TAAGAACTTG CTTTTCATTT AATGTTCTCG TGTACGCAAA 3240

AACTTGTGAA TTTTCAGCAT CTACTAAATC AAATTGACCA TATACGTATA CATCATTAGA 3300

CTTTCTTAAT TGAATTAAAT CTTTATAAAA TTGTAATACT GAATGCTCAT CTTCTAATTG 3360

TTGTGCAACA TTGATAGTTT TATAATTCGG ATTCACTGGG AACCACGGTT CACCATTTGT 3420

AAATCCTCCA TTTAACGTAT CATCCCATTG CATTGGTGTG CGAGAATTAT CTCGGTTCTC 3480

ATCTTTATAT TTCGCAAGTA AAGCGTCTAC ATCTCCACCT TGAGCTTTCA CTATTTGATA 3540 

GTCATTTTTA ACAGCAACAT CGTTAAACGT TTCAATACTT TCAAATGGAT AATTCGTCAT 3600 
ACCAATTTCT TGACCTTGAT AAATGAATGG CGTACCTTGT TGCAAGAAAT AAACAGCTGC 3660 
ATGACTTGTT GCTGATTCAT ACCAATACTT GTCATCGTCA CCCCACGTCG ATACACGTCG 3720 
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CCATCTATTT AATACAGATT TATACGAATT TACATCAAAG TGAGAATCAC CACTATTCCA 384 0 

CAGTCCCAAA TGTTCAAATT GGAATATCAT ATTAAATTTA CCATTTTCTT CCCCGACCCA 3900 

GTCATCAGCA TCATCAGGGC TTACACCATT CGCTTCACCA ACAGTCATAA TGTCATACTT 3 960 

ACTTAATGAG CGATCTTTCA TCTCTTGTAA CCAAGTTTGT ATACCTGGCT GATTCATATC 4020 

TACATCAAAT GCTGGGGCAT ATGTTTTACC CTCAGGTACA GGTAAGTCAC CCGCTTCAAA 4080 

CGTCTTCTTA ATATGCGTAA TTGCATCTAC TCTAAATCCA TCAATGCCTT TATCAAACCA 4140

CCAGTTCATC ATTTCAAATA CAGCATCTCT AACTTCCGGA TTACCCCAAT TCAAATCAGG 4200

TTGTTTTTTA CTGAATAAAT GGAAATAATA TTGCTCAGTA TTAGCATCAT ATTCCCATGT 4260

AGATCCATTA AATATACTTT CCCAGTTGTT AGGTTCAGAG CCATCTGGCT TTGGATCTTG 4320

CCAAATGTAC CAATCACGTT TGGGATTGTC TTTACTAGAT TTGGATTCTA TAAACCAAGG 4380

ATGTTCATCA GATGTATGAT TTACAACTAA ATCTAAAATA AGCTTCATGC CTCTATCATG 4440

AACACCTTTT AATAAACGAT CAAAGTCTTC CATCGTTCCA AATTCATCCA TAATCTCTTG 4500

GTAGTCACTA ATATCATAAC CATTGTCATC ATTAGGTGAT TTAAACATTG GACTGAGCCA 4560

AATGACATCG ATACCGAAAT CTTTTAAGTA GTCCAATTTA TCAATCATTC CAGGTAAATC 4620

CCCAATACCA TCGTGATTAC TATCATTAAA ACTTCTTGGA TATACTTGAT ATGCTACTGC 4680

TTCTTTCCAC CATTGCTTAT TCATTTTAAA ACTCCTTTGC TATCGCTGTG TTGATTTTCT 4740

TATTTTTAAT TCTGTATCTA TAATGACGAG TTCAATAACA TCCTGTGCTT TGTTTTTCAA 4800

TATATTTAAA ATTGCTGCAC CAGCCTGTTG ACCTAACATT CGAGGCTTGA TGTCAATACA 4860

GGTTTGTGGT GGTGACGCAA TTTCGGTTAA ATAAGAATCA TTGAACGTTG CTGTCATTAC 4920

ATCTTTCGGA ATTTCAATAT TAAGTTCATA TAGGACACTT AAAATCGCTA AATGTAACAT 4980

AGCATCTAAC GAAATGATTG CCTGTTTAAT ATTTGGGTCC TTCAAACGCG TATGTAGATT 5040

TTGCATGTAA TTAAAAATAA CTTCTCTTTC ATTACTAGTC TCAATAATTT GATAATTAAT 5100

TTTATTTTGA GAAGCTATCG TTTCAAATCC TTGAATTCTA TCTTTTGAAA CTTCAAAATT 5160

TCCTTTTTCT GTAATAAATA TTAATTCATC TACACCTTGT TCAATAACAT GTCGTGTCAA 5220

ATTTTCAGAA GCTAATATAT TATCATTATC TATATGTGTA AATTGATGAT CTATATCCGA 5280

TGTAGGCTTA CCAATCACAA TAAATGGCAT GCTTTCATCA ATTAACATTT GTTTAATCGG 5340

ATCATTTTCT TTTGAATAGA GCAGTATAAA CGCATCAACC ATTCGTTGTT TAATCATTTT 5400

ATAAACTTCA TCCATTAAAT CATTCATATT ATTTGAGACT GTCGTTTGTG TACCATAGCC 5460 

ATGCTGGTTA CACGTTTCAG AAATTCCTAG CAATACATTG ATGTAGAATG GATTCAGTCG 5520 
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AGTTCTAGCA GCGGTATTAG GAAAATAATT CAATTCTTCC ATAACTTTCT TCACTTTTGA 5640 

AATTGTCGCT TCGCTAATAC GTTGATTTCC TTTTATAACT CTTGAAACTG TCGAAGGAGA 5700

AACACCGGCT TTTAGTGCAA CATCTTTAAT CGTAACCATT TAATCACCTC CTGTTAATTT 5760 

CTGCATCGGA AAACGCTTCC AACCACTGTA TAATACCAGT TTAGTCACAC TTTCTAAAAA 582 0 

AGTCAAAAGA TTTGTGCAAA CGATTGCATA AAACGATAAA AATAAAACCT TCATACTGAA 58 80 

ATTCAATCCG AAAATCAATA TAAAGGTTTG TATAAATATT AAAATCGATT GTTTAGTCAC 594 0 

TAACTGCAAA ATAGTTACCT TGGCCATCTT GAAAATTAAA TACACGTTGA CCATTCATTT 6000 

CTACTATATC ATGCCCAGTT AAACCTAAAT CATTTAATTT TGAGTATAAT GCATCAAAGT 6060 

TTTTCTCTTT AAACATTAAA GATGGTGTTC CTAGGTTCAC TTCCGGGCTA TGCTTTTCAA 6120 

TAAATTCTTT TGCCATAATC GTCAATGACG TTTCAGCATC TTTGGTAGGT GATACTTCAA 6180 

CTGCAACATA GTCCTCAGCT AACGGTGTTT CACTTACAAC AACAAATTCT AAAGTTTCTG 624 0 

TCCAAAATGC TTTCGCTTTT TCGACATCAT CAACATATAA CATAACTTGA TTTAACTTTT 6 3 00 

CCATAAAATA GTACCTCTAT TTCT CT AT AG TACATGCTAT CATAACACAG TAAATATTTT 6360 

ATTACTTCAC AAAATGCTTA AAAATATGGC GGGATGCTTT TAAGGTCAAG GATAATACTT 6420 

GTGTAATTTT TTATAGGTTG TAGCTACTCT ATCACACTCT CTTTTATATT TATCAAAAGA 6480 

TATAAAAAAG GATAGTATCT TTCAACTATC CTTTAATCAA T ATT ATT C TT CAATCCATTG 654 0 

TGTATGGAAT ACGCCtTCTT TATCTTTTCT TTCGTACGTA TGAGCACCGA AGTAGTCACG 6600 

TTGTGCTTGA ATTAAGTTTG CAGGTAAATC AGCAGCACGG T AACT AT CAT AGTAATTAAT 6 660 

ACTTGATGAG AAACCA G GTG TTGGTACACC ATTTTGAACA CCAGTTGCGA CAACAT CACG 6720 

TAACGCATCT TGATATTCAG TAACGATGTT TTTAAAGTAA GGATCTAGCA ATAAGTTTTG 6780 

TAATCCTGGA TTATTATCGT AAGCATCTTT GATCTTTTGT AAGAATTGTG CACGGATAAT 6840 

GCAACCTTCT CTCCAAATCA TAGCTAAATC ACCAAGTTTT AAATTCCATT CATTATCTTC 6900 

ACTTGCTTTA CGCATTTGcG CGAAACCTTG TGCATAAGAA CAAATTTTAC TCATATATAA 6960 

TGCTTTACGA ATTTTTTCTA AAAAGTCTTT CTTGTCACCA TGAAATGATG CTTTTGGACC 7020 

ATTTAATTCT TT AG AAG CAT TTACGCGCTC TTCTTTGaTT GAAGAGATAA AACGTGCAAA 7080 

TACAGATTCA GTAATGATTG TTAATGGAAT ACCTAATTCT AATGCGTTAA TTGAAGTCCA 7140 

TTTTCCTGTA CCTTTTTGaC CTGCAGTATC AAGAATTTTT TCAACTAATG CTTCTTTATT 7200 

TTCATCTAAT TTCATGAAAA TATCACCAGT GATTTCAATT AAATAACTTT CTAATTCACC 7260 

AGCATTCCAG TCTTTGAACG TTTGAGCAAT GTCTTCATGA G ACATG C CT A ATAATTCTTT 7320 
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CATTTTCACA TAGTGTCCAG CACCATTAGG TCCAATATAA GTAACACATG AAGCACCGTC 7440 

TTTTGCCTTT GCAGCAATTG CATCAAGAAT ATCTGCAACT TTGTTATAAG CTTCTTCTTG 7500 

TCCACCCGGC ATTAATGACG GACCAGTTAA CGCTC CAATT TCACCACCAG AAACGCCCAT 7560 

ACCAATAAAG TTGATTGCAC TTTGTGywAA TG CTTTATT A CGTCTGATAG TATCTTGATA 76 20 

GTTTGTATTA CCACCATCAA TTAAAATATC TCCATCATCT AATAAAGGTA ACAAACTATC 76 8 0 

AATCGTTGCG TCCGTAGCTT TACCTGCTTG AACCATTAAT AAAATTTTAC GTGGTTTTTC 7740 

TAAAGAATTA ACAAATTCTT CCAATGAATA CGTTGGATGA ATATTTTTCC CTTTTGATTC 7 8 00 

TTCAACCATT AAATCAGTTT TTTCACTTGA GCGGTTAAAT ACAGATACAC TATATCCGCG 7860 

TGATTCAATA TTCCAAGCTA GGTTTTTACC CATAACGGCT AAACCAATAA CTCCAATTTG 7920 

TTGTGTCATA TTACTTACCT CACTTGTTGA TTTTTCATTA GTATTGTATC ACAAAATAGA 79 80 

CATACACTAC ACTAAATCAT TTCGAATGTC GCGCAACTAT TTTGATTATT TCTAACACTT 8040

GACTTGCAAG CAAGTTCAAT GATTTAATCG GCATTCTCTC ATTTGTTGTA TGGATTTTTT 8100

CATAACCCAC TCCTAAAATG ACTGAAGGAA TACCAAATGT ATTAATAATA CTGCCGTCTG 8160

AACCGCCACC AGAAATAATT GTATTTGCAG ATAATCCTAA ATTACGAGCA CTTTCTTGTG 8220

CAATTTTAAC AACCGCTTCA TTATCATTAA TTTTAAATCC TGGATAACTT TGCTCCACTG 8280

TAACTACTGC TTTCCCACCT AATTCTGATG CAGTAGTTTC AAACACATCA GTCATATGTT 8340

TGACTTGTGT TTTTATTCTT TCTGGATCGT GAGAACGTGC CTCTGCTTCT AAAATGACTT 8400

CATCTGCAAC AATATTCGTA GCTGAACCGC CATGAAACTT ACCAATATTG GCAGTAGTTA 8460 

TTTCATCAAC TTGTCCTAAT TTCATTCGAC TAATTGcTTT CGCCGCAATA TTAATAGCAC 8520 

TAACACCCTC TTTTGGCGTA CTTGCATGAG CCGTTTTGCC AAAAATTTTA GCTGAAATTA 8580 

ACATTTGCGT CGGTGCACCT AGAACCGTAG TACCGACATC AGCACTTGCA TCAATAGCAT 864 0 

AACCAAAGTC CGCGTCCAAC AAC TCTGAAT TTAATTCTTT AGCACCAATT AAACCTGATT 8700 

CTTCTCCAAC AGTAATCACA AATTGAATTT GTCCATGTGG GATTTGTTGT TCCTTTATCA 8760 

CTTGCAAAAC TTCAAGCATC GCTGATAATC CTGCTTTATC ATCTGCACCT AGAATAGTCG 8820 

TACCATCAGA GTATATGTAG CCGTCATCTT TTACAATTGG CTTTACATTA ATTGCGGGTA 8880

CAACAGTATC CATATGGCTC GTCAAATATA ATTTAGGTAC TTCGCCTTCT TCGATAGTAC 8940 

TATTCATTGT ACACACTAGA TTATTGGCAC CTAATTTAGG ATGTTTAGCC GCTTCATCTT 9000 


CTTTAACATC TAACCCTAAT GCTATGAATT TTTCTTTTAA AATAGGTTGG ATTGTTGATT 9060 

CATTCCCTGT CTCAGAATCG ATTTGTACAA GTTCAAAAAA CGTATTAAGT AATCTTTGCT 9120 
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GATGAAATAA AATGTTACAG TAATTGACGT TACACAGATT TATCAGGTTT GTAAATTGTG 9240 

T CAT ATT ATT TTCAATTTAT TATATATAAT TATTGTAACT CAAACTAAGC TTTGTCAAAA 93 00 

ATATATTGAT TGATTTTTCA AAGATATCGT ATAATGAGGA AAATGACATA AGCAAACTTA 9360 

CTCATGTTTT TTATTATATT CCTTTATGAT GATTGCTAGT TATATCGTCT CAAGTTAAAA 9420 

GTTTTATATC TTATGTCGTA ATTATTAATA CAAAGGTTAT TCATTTGGAG GCACACAAAA 94 80 

TGCAAAATAA AGTTTTAAGA ATTATCATTA TCGTTATGCT TGTATCAGTT GTATTAGCAT 954 0 

TGTTATTAAC GAGTATCATT CCAATTTTAT AAACT AT AT C TCAACTACCT ATACAAAATC 9600 

ATACAATTAA AAATCCATCC ATTATAAACG CATGTATTAA TAAGTTATCG TATTGCAACG 9660 

ATTACTTTCA AACATGGGTC ATACGGATGG ATTATTTTTT AAGCTACTTC ACTATGCATT 9720 

TTCAATGAAC CAAATTGCGA TTTGATTTGT AAATATTCTT CTAATTCATT TAATATTTGA 97 80 

ATAATACTTG CTCTCGAGTT AAGCGCTTTG TGTGTTGTTG GCAATGGCAG TTCATCCAAT 984 0 

TTCAAACGCG TCTCATACAA ATTGTGTAAA CGCATTGCTG TATAGTCATT ACTATTCACA 9900 

TTTAGACCAA TTTCTTTCAG CAGTGACGCA ACATCATTTA AAAGCGGATC TTTATGACAG 9960 

ATACTTTCGA TGAGCGGTTT CATTCTCATT AACAATTCCA CTTGCTCTTC TCGCAT AT CA 10020 

AAATAATGAT AGTATGAATT TTCGTTTCTA ACAAAATGAT TTTTAACATC TCGGAACGCG 10080 

ATAGACT t CG CCTTTTTAAT ATTTAAAAGT AACACTTCAA ATTCAATCGC AATGGTATCT 10140 

TCATATTTTT CACAAATATA ACTATATTTA CTAAAAATAT CAGCAATTTG TTGCTCAATT 102 00 

TTACATTTGT ATT CG TC t AG TTGTTTGTCT AAACTTGGCA TCATTAAATT CaTTGTAAAT 10260 

GCAATGCTTA GTCCAATTAA CAGTAATAAT GTTTCATTAA CAATTAAATG TGCATCAATT 10320 

GATTTTGCAT TAAAAACATG AAGTAATATA ACGCAACTCG TAATGACACC TTCTTGTACT 103 80 

TTTAATACGA CAGTTAATGG TATAAATAAC AATACGATAA TACCGAGTAC AATTGGACTC 10440 

TGACCTAATA AACTAAATAT TGCTGAACCT AAAAACAATA CTAAAAAACA TGATACTAAT 10500 

CTTGAAATAA TCGCTTGTAG CGAATGTACT TTTGTATGTT TAATACATAA TACGACTAAT 10560 

ATGGCGCTTG AAGCATAATT ATCTAAACCT AACAGCTTAC TAATAATTAC ACCTAAAGTC 10620 

AT ACC CACTG CTGTTTTTAT TGTTCTAAAT CCAATCTTGT AAGGATTTAA CTTTAACATG 10680 

GGTTAGCGCC TCTTATCTTT CTTCACAATA TTTATTGAAT AATGTTTGTA ATTGATTAAT 10740 

TACGTTCATC ACATCATGAC CTTCGATTTG ATGTCTTTCA ATCATTTCTG TAATCTTTCC 10800 

ATCTTTTACT AATGCAAATG ACGGACTTGA AGGCGCATAA CCTTCGAAGT ATTCACGCGC 10860 

TCTTTGTGTC GCTTCTTTAT CTTGTCCAGC AAATACTGTC ACTAGACGAT CAGGTAATAC 10920 
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AGAATTGATC 
ATAACTAGTG TTGTACCATC TTGTTTAAGA ACTTTGTCAA CATCTTCTGC 11040

AGTAGTTAAT TGCTCATATC CCGCAGATTC AATTTCATTC CTTGCTTGTT CTACAACACC 11100

GTTCATGTAT AAATCGAAAT TCATGnCCAT AAGTTCAATC ACCTATCCCT TTATATTTAA 11160

ACTAtCCTCA TTCTACTAAT TAATAACATA TTGTTCAATA AACTAATCTG AATCACACCT 11220

ATATTTAGAC ACAATTTTAA CAATATACCA AACATTATTG TGCTTAAAAT CATGGTAACT 11280

AATTTGTTCA CATGTTTTCA TTAATATGTT TCAAGTATGA TGTCTTATTT TGACTTTACT 11340

GCAAAAATGC ATTCAACCAT GTTGATTATT GTTCTTTATC TTTTTTGAAT ATATTGCACA 11400

TATTTTAGTG CCAAAAAATA ATACATCCAT CGACAAGAAC AAGATAAAAC AAGTTGTCGA 11460

TAGATGCATC TATGTTATCA CTAATATATA TTTGTATTTT CTAAAGTATA CTGTTCGATA 11520

CGCTGTTTAA TATGATTCAT ArATTTACCT GTTTGTAAAC CATCTAAAAT ACGATGATCA 11580

ATTGAAATAC ATAAATTAAC CATGTTACGA ATTGCAATCA TATCATTAAT TACTACTGGC 11640

TTTTTAACGA TTGATTCTAC TTGTAAAATC GCTGCTTGTG GATGATTTAT AATACCCATT 11700

GATGATACTG AACCAAATGT ACCAGTATTA TTTACCGTAA ATGTACCGCC CTGCATATCT 11760

TCAGCTGTCA ATTGCTTATT ACGCGCTTTC GTTGCTAAAG TATTAATTTC TCTAGCTATA 11820

CCTTTGATTG ACTTTTCGTC TGCATGCTTA ATCACAGGTA CGTATAATTT ATTTTCATCA 11880

GCAACAGCAA TTGAAATATT AATGTCTTTA TGTAAGACAA TTTCATTTCC TTGCCAGCTA 11940

CTATTTAATA AAGGATATGC TTTTAAAGCA TCTGCTACAG CTTTTACAAA GAAAGCAAAG 12000

AACGTTAGAT TATATCCTTC TTTATTTTTA AAGCTGTTTT TATAATGATT TCTCGTATTC 12060

ACAAGATTTG TAGCATCTAC TTCAATCATC ATCCATGCAT GTGGAATCTC TGTTACACTA 12120

TTAACCATAT TTTGCGCAAT TGCTTTACGC ACACCATTTA CTGGTATTGT GCTGTTTTCA 12180

CTATTGTCTT CAGATGATTG GTTACTTGAT GTATCTACTG ATGTTGATTT TGTTTGAACT 12240

TGTTTGTCAG ATTGAGCTGT GGTACCACCA TTTTCAATAA CTGACATTAT ATCCTTCTTA 12300

GTTACACGAC CTTCAAATCC ACTACCTACA ACTTGTGATA AATCAATGTC ATGCTCTGAA 12360

GCGAGTTTAA ATACAACAGG TGAAAAGCGA CCATTATTAC GTGGTTGATT TTGTTTAGCA 12420

GTAGATGTCT GTTCCACTGT TGCACTAGCT TTTTTAGTAG ATTTCTGAGT ATGCTCATCC 12480

ACTTTTGCTT GTATCTCTTC AGTTGTTTCA TTTGTCTTTT CATCAGCAGT TTCAATTTTA 12540

CAGATAATTG TATCAATAGC TACTGTCTGC CCCGCTTCAA CTAAAATTTC TGTAATTGTT 12600

CCTGATATCG TGGAAGGGAC TTCAGCTGTC ACTTTATCTG TAATAACTTC ACATAATGGT 12660

TCATATTCAT CAATATGATC ACCAACAGAA ACTAACCATT GTTCAATGGT GCCTTCATGA 12720
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AATTCACGCA 


TTTTATTTAA GATTTTTTCT GGATTCATCA TAATTTCATT TTCTAATACA 12840

GGAGAAAATG GCATAGATGG TACAtCTGGA GCAGCTAAAC GCATGATTGG TGCATCTAAA 12900

TCGAACAAGC AATGCTCTGC AATAATCGCT GACACTTCTG ACATAATACT ACCTTCTAAA 12960

TTATCTTCAG TTACAAGTAA AACTTTACCT GTATGTTTAG CACGATCAAT AATTGTTTCT 13020

TTATCTAATG GATAAACAGT TCGTAAATCA ACGACTTCAA CATTGATACC GTCTGCAGCT 13080

AAAATATCCG CTGCTTGTAA ACAATAATTG ACCATTAATC CATAACAAAA TACTGTTAAA 13140

TCTTCACCTT CACGTTTCAC ATCTGCTTTr CCTAAAGGTA CAGTGTAATA TTCTTCTGGC 13200

ACTTCTTCCT TTAAGAAACG ATAAGCTTTT TTATGCTCAA AGTACAATAC TGGATCATTT 13260

GATTCGATAG ATGATAATAA AAGCCCTTTA GCATCATACG GTGTGGAAGG AATAACAATT 13320

GTTAAACCTG GCGATGAAGC AAATATACTT TCAATACTTT GTGAATGATA TAGTCCTCCG 13380

TGAACACCGc CACCAAATGG TGCACGAATC GTTAATGGGC ATTGCCAATC ATTATTTGAA 13440

CGATAACGCA TTTTCGCAGC TTCACTAATA ATTTGATTTG TCGCAGGTAA AATAAAATCT 13500

GCAAATTGAA TTTCTGCAAT TGGTCTTTTA CCTACCATAG CTGCACCAAT GGCAGTTCCA 13560

ACAATATTTG ACTCAGCTAA TGGCGTATCG ATAACTCTGT CTTCACCATA TTTTTGTTGC 13620

AGTCCTTGAG TAGTACCAAA TACGCCACCT TTTCTACCAA CATCTTCACC AAGAATAAAC 13680

ACATCTTTAT TTTGTTGTAA TGCTAAGTCT TGTGCCtGcG TATCGCCTCT AAATAAGATA 13740

ATTTAGCCAT TAGTTAAGAC TCCCTTCTTC GTACACAAAT GCATAGGCTT CTTCGACACT 13800

TGGATATGGC GCGTCTTCAG CAGCCTTTGT CGCTTTATTG ATGATGTCTT TnATgTCCGC 13860

TTCTATTTCT GCCAACCAAG CATCATCGAT AATGCCAGCT GAAAGCAACT CTTTTTTGAA 13920

CTTTTCATTG CAGTCTGCTT TTTTAAGcGT TTCACGCTCT TCTTTCGTAC GATATTGGTC 13980

GTCJCTCATCT GATGAATGAG CTGTCATACG ACTTGTTACT GCTTCAATCA AAGTTGAACC 14040

TTGACCAGAA ATAGCTCGAT CTCTTGCTTC TTTCATCGCT TTATACATTG CTAATGGATC 14100

ATTACCATCT ACTTGTTCAC CATGTATACC GTAACCAAGT GCTCTATCCG ATAATTTTTC 14160

AGCTGCGTAT TGTAATGAAT CAGGTACTGA AATTGCATAT TTATTATTTA TAATGACACA 14220

TACAAAAGGA AGTTTGTGTA CACCCGCGAA GTTTAAACCT TCATGGAAGT CACCTTGGTT 14280

TGAGCTACCT TCACCAACAG TTGCTGTTGC AATTTTCTTC TTACCATCCA TTTTTAAAGC 14340

TAAAGCAGCA CCAACAGCAT GGGGTATTTG AGTTGCTACC GGTGAACTTT GAGACAAAAT 14400

ATTCTTAGCT CTACTACTAA AGTGTGATGG CATTTGTTTT CCACCAGAGT TAACATCGTC 14460

TTTCTTTCCA AACGCTGATA AAAACGTATC ATACGCTGAG ATACCCATAT AAGTAACGAA 14520
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AATCTGAGTT GCTTCTTGTC CTTGACCACT TACAACAAAT GG AATTTTAC CTGCACGGTT 14640 

CAATAACCAC AGTCTTTCAT CTATTTTTCT ACCTAAATCC ATCCATTTAT ATATTACTTT 14700 

TAGGTCTTCT TCGCTAAGGC CTAATGATTT ATAATCAATC ATGTTAAATC CTCCTATTTA 14760 

TACGTGAATA GCTCTACTTT CTGCTTTCAA TCCTAATTCC ATCAACACTT CAGAGATGGA 14820 

AGGATGTGCG TGTGTTGTTA GTCCTAATTC TAATGCCGAG CCATTCATGA ACTGTAACAG 14 880 

TGATGCCTCA TTAATCAATT CTGTTACATG TGGACCAATC ATATTAATAC CCACAATTTC 14 940 

TTCAGTTGAT TGATCAATCA CCATTTCGCT ATACCCTTCG TTTGTGTCAT GG CTATCAAT 15000 

CACTGCTTTA CCAATTGCTT TAAATGGTAC TTTAAAACTT TTAACTTTCA TTCCCTCTGC 15060 

CTTTGCTTGT TCAATGTTTA AACCGATAGA AGCAATTTCA GGTTGTGAAT AAATACACTT 15120 

AGGCATCATG TTATAGTTTA CTGGGATTGG GTTCCCCTCA AACATATGAT CAACAGCCAC 15180 

AACACCTTCT TTTGATCCAA CATGTGC CAA TTGTAATTTT CCTATACAAT CACCAGCTGC 15240 

ATAAATATGT TTATCTTCAG TTTGTTGAAA TTCGTTCGTT AAAATATGTC CTGATGTTGa 15300 

AAGtTTTATT TTAGTGTTGT TTAAACCAAT ATCTGATGTG TTAGGTTTTC TACCAATCGA 15360 

TAGCAACACT TT AT CTACTT TAATTATGTC TGAGGAAATT TCAAACGTAA CACCATCTTC 15420 

GTTAACATTT AT AT CATTTT CAGAAAGTTT TATTCCCTCA TAGAATTTAA CACCACGTGC 15480 

TGACAATGAT TTTTTTAATA GTTGTGAAGC TTGTTTACTT TCAGTTGGTA AAATTCTTTC 1554 0 

ACCTGCTTCT ATAACTGTTA CGTCAACACC TAAATCTATC ATCAATGATG CAAATTCCAT 15600 

TCCGATAACA CCACCACCAA TAATACCAAT ACTTGATGGT AACGTCTTTA ATGATAATAT 15660 

ATCATCGCTA GATAAAATTT TATCATGATC AAATGATAAG AATGGCAACT CTGCAGGCGA 15720 

AGAACCAGTT GCAATTAATA CAAATTGGTT GGGTAATAAG TCTGATTCAC CATCTTCATA 15780
TTCGACAGAA ATTGTG CCAC TTTGAGGTGA AAATATAGAT GTACCTAGAA TACGTCCCGT 15840 
GCCATTATAA ATGTCAATGT GATTGTGTTG CATTAAATGC TTTACACCTT GATACATTTG 15900 
ATTAATAATG TCTTCTTTTC GTGCCAACAT ATTTTCAAAA TTAACATTAG CATCTTTGAC 15960 
ATCAACGCCA AACATTGCTG CCTGTTTTAC TGTTTGAAAT ACTTCAGCAG ATTTAAGCAG 16020 
CGATTTAGTA GGAATACAAC CTTTATGGAG ACAAGTACCT CCTAATAGTT GTCGTTCTAC 16080 
TATTGCCACT TTTTTACCTA ATTGAGACGC ACGTATCGCA GCAACATATC CTGCAGTACC 16140 
TCCACCGAGA ACGACTAAAT CATATTGTTT CTCTGACATG TTCTTACTCC TAACTAATGA 16200 
TATATATCCA TTGAAAATTT ATTAATACAT AGTTTTCATG TCCATTAATT ACCTATTTTA 16260 
CATGATTGTC TATTTAGTTT GAATGCACAT AAATAAATCC ATAAATGAGT ATTCAACACA 16320 
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TAAATCAGTA ACACTTGCAC CTGAAATCAT TCGTGCAATT TCATCTACTT TATCATCGCT 16440 

AATTAACTCT TGAACTTGTG TTGTTGTACG ATCATCTTTT GATGATTTCG AAATTAATAA 16 500 

ATGATGGTCG CTCATCGATG CAACTTGTGG TAAGTGAGAG ATACAAATAA CTTGTATATA 16560 

TTCTGCTaTA TCTCGCATTT TCTCTGCCAT TT 16 592 
(2) INFORMATION FOR SEQ ID NO: 54: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13794 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

CCAATACAAC GTAAAAAGAT TGCTTGTGTT ATTAATGAGT TAGATAAAAT AATTAAAGGA 60

TTTAATAAGG AAAGAGACTA CATAAAATAT CAATGGGCTC CAAAATATAG CAAAGAnTTT 120

TTTATACTTT TTATGAACAT TATGTACTCA AAAGATTTTT TAAAATATCG ATTTAATTTA 180

ACATTTCTTG ATTTATCTAT CTTATATGTA ATATCATCTC GAAAAAATGA GATACTAAAT 240

TTAAAAGATT TGTTTGAAAG TATTAGATTT ATGTATCCTC AAATTGTTAG GTCAGTTAAT 300

AGATTAAATA ATAAAGGTAT GCTAATCAAA GAACGATCCC TTGCAGATGA AAGGATTGTG 360

TTAATCAAAA TAAATAAAAT ACAATATAAC ACTATTAAAA GCATATTCAC AGATACTTCC 420

AAGATTCTCA AACCAAGAAA ATTTTTCTTT TAAATTTAAA CAGATTTACC TCTTGATAAA 480

ATAAATAAGC AATCATACTA CTTCTCAATT TAGTATAAAT AAAAATACAT AATTAACTTT 540

CTTTTGTTTT TATATTATTT CAATACCCTA CTATATATCA CAACACATAA ATTAAGCATG 600

ACACTCATTC AATTTAGTTC ACCATTTCGT GTTCCAATTT TACTGAGTAT CATGCTTTTA 660

ATGTTATAAA CCTAATGCTT TAATAAATCG TGTTAATTCT TCTCGCATAC TGTCATCTTT 720

CAATGCATAT TCTATGGTAG TTTTAACGAA GCCTAATTTT TCTCCAACGT CATAACGTTC 780

GCCTTCGAAG TCATATGCAT ACACTTGGTT ATCATTATTC ATACGTTCAA TCGCATCTGT 840

TAACTGAATT TCGTTACCTG CGCCTTCTTT TTGCGTTTTT AAATAATCGA AAATTTCAGG 900

CGTTAATACA TAACGTCCCA TAATAGCTAG GTTTGATGGT GCCGTACCTT GTGCTGGCTT 960

TTCAACAAAC TTTTTCACTT CATACTGACG TCCGTTTTTA GTTAATGGGT CAATAATTCC 1020 

ATAACGATGA GTATCTGCTT CCGGAACTTC TTGGACACCT ATAACTGAGT GCCCTGTTTC 1080 

TTCATAAACG TCAATCAACT GTTTCACTGC TGGCACTTCA GATTCAACAA TATCGTCACC 1140 
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TAAACCTTTT TGTTCTTTCT GCCTTACATA AAAAATATTC GCAAGTTCCG TTGAATACTG 1260 

AACTTTCTCT AGTAATTCAG ATTTACCTTT TTCTTTTAAC ACCATTTCTA ATTCTTTTTG 1320 

ACTATCAAAA TGATCTTCAA TCGCGCGTTT GTGGCGACCT GTCACTATAA TAATATCTTC 13 80 

AATTCCAGCT CTTGCAGCTT CTTCAACGAT ATATTGTATT GTGGG TTTAT CTAAGATAGG 1440 

AAGCATTTCC TTTGGCATCG CTTTAGTTGC TGGTAAAAAT CTAGTCC CTA AACCAGCAGC 1500 

GGGAATGATT GCCTTTTTTA TTTTTTTCAA AGTTAATGTG CTCCTTTTCC TAAGTATTAA 1560 

ATCTATGTAT CAACGTCATT TTAACACTAA TTAGAACGCC TTCATAGTGT CATTGAGTAT 16 20 

GTAATTATTT CTTGGGAAAT TTGTTTTAAT TTTAAAAAAC AGG CTTACTT CATATAATTT 16 80 

ATGAAATAAA CCTGTCAATT TTGGATTGAT TATGCTTTGT GATTCTTTTT ATTTCTG CGT 1740 

AATAACGCTA AACCTAAAAT GCTAAATAAT CCGCCGAACA ACATGCCGTT GTTTGTTGAT 1800

TCTTCTCCAC CTGTTTCAGG TAGTTCAGAT TTCTTAGATT GTGCTTTTTT AGTTGGTACC 1860 

ACTGCTTTAA CCTTTTCATT GATTTCAATA ACAGGTGTTA CTACTTTACC TTGTTCCACT 1920 

GGTTTAGAAG GTTTTTTAGG TTCTTCTTTA GCAGGTGGTA TTGGTTT AC C AGGTTCAGTT 19 80 

GGTACCTCTG GCGTTGGCGG TGTTGGTGTT TCCGGCTCGC TTGGTACTTC TGGTGTCGGT 2040

GGTGTTGGTG TTTCCGGCTC GCTTGGTACT TCTGGTGTCG GTGGCGTTGG TGGCACGATT 2100 

GGAGGTGTTG TATCTTCTTC AATCGTTTGT TGACCTTCAT TATGACCACT TACTTGTGGA 2160 

AGTGTATCTT CTTCAAAGTC AACACTATTG TGTCCACCGA ATTGATAATT TGGTTTATCT 2220 

TT ATTTG TAT CTTCTTCAAT AATTTCAGTG TGCTTATTGA ATCCGTGAAT ATGTGGCACA 22 8 0 

CTGTCGAAGT CGATATCAAT GATATTACCA CCTTGTTCAT ACTTAGGTTT GTCTTTCTCT 2340 

GTATCTTCTT CGAATGATTG GTTACCATTA TTTTGACCAT GAATTTGAGG TACACTATCG 24 00 

AAATCGATAT CTACGATATT GCCACCTTGT TCATATTTCG GTTTATCTTC TTCTGTGTCT 24 60 

TCCTCAAATG ACTGATTACC GCTATTTTGG CCACCTTCGT AACCTAATTC ACTCTTAATA 2520 

TCCACGTGGC TATTTTCTTC GATTTCTTCA ATCACGCCAT AATTAC CGTG ACCATTTTCA 2580 

GTTCCTAAAC CAGAATGAGA AATATGATGA TTGTTTTCAG TAATTTCCTC GATTGGTCCT 2640 

TGCGCTTGAC CATGTTCTTC AGGTAGTTCA TCTACTAGTT CAATCAGATT ACTTTCAGTC 2700

GTATATTCTT TCGTATCTTC AATTGTTGTA TGATCGCTAA CAGCACCAGT TACAATACCT 2760 

TTTGTAGAAT CTTCGTCAAA TTCAACTAGG TT AG ACT CAG TAGTAACCTG ACCACCACCT 282 0 

GGGTTTGTAT CTTCTTCATA TTCAACAACA TCAGCATGAT GTTTTGAATT TTCATGTGTC 2880

GATTCTTCAA AGTCTACATG AATAGAATCT TCTTCAGTTT CAATGGTACC TTCTGCATGA 2940 
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TCTTCGATTG TACCAGTCAA TTCATGCTTC TCCACTGGCG GCTCTGATTT AAATTCAAGT 3060 

TCGATAGGAG TACTATGTTC TATAATAGGT TCCTTTAGTT TATCTTTGCC GTCGCCTTGA 3120 

GCGTTATTAG AGTAAAATGC AACGCCATTT TTCCaAGTTA AATTACTTGT ATAATAATAG 3180

TTATAATATC CAAAAAGGTG TGTTTGAAAT TCTAAGTTGC TAGCATTTGA ATCATAATAC 324 0 

CCTTCATATT TTATTACATA ATTTTTACTT TGGTCTAAAT TATTAAAGTT TAAAGAATAA 3 300 

CCACCATTAG TATCAAAATC TAAACTCATA TTATCAGTCA CAT C TTCAAA TTTGCTGACA 3 36 0 

TCATCAAGCT TTGCATAnTn AgctTTCAGC TAAATCGTCT GAACCAATGT GTTTATATAC 3420 

CTTAACTGTT GGATTATTAA CCCCTGGTTT ATTTCCTTTA GTTACTTGAC CAGTTACTGT 3480

CACAGAGCTT AACGACTGGT TGTTAGGTTT CATGTACGCA AAATGACTAA ATTTCCCATC 3 54 0 

TACTTTATTT AAAGTATCAA TTCGACCATT AGCTGTTACT CCCCAATTAT CTCTAACTCC 3600 

ACCTAAATAT TGAATATTAA ATATTTTGCT AACCGTAGTC TCACCCAATT TAACTTCAAC 3660 

ATTTTGGTTA CCTTTTTGCG TCACTGTTGT AGGATCAATA AATAGATTTA AAGATAATTC 3720 

AGCAGTTAAA TCTTTCTTTT CTTGTACATA TTCTTTAAAC GTATATCTAA CTTTTCTTTC 3780 

TCCAATTATT TCTCCTGTCG CCATAACTTG ACCATCTGTA CTTTTTATCT CCGGAACTTT 3 840 

ACGCAGTGTT GAGATACCAT GAGTTTCAAC ATTATCGCTT AATGTGAAAT CAAAATAATC 3 900 

TCCCGCCTTA ATTCCTTCTC CAAATTTCCA TTTATATTTC AAGGTTACTC TTTCTGCGTT 3 960 

ATGAGGATTT ACAACATTCG TATCTTGTTT ATGTCCTACA ATTTCACTAC CTTCTTCTAC 4 020 

TTCCACTTTA TTTGTTACAT CTGTACCTGT CGCTTTAGTT TCTTCCACTA CTTCTTTCTC 4 0 80 

TGCAACTGCT GTAACGTCAt TGatCTTTTC ATTCTTGGTT TAATTTCTGA GACGTTACTT 4140

GGTTGAGCTA TGTCAACTTG AGTTCCTGTA GTTTCCTTAT CAGCAACTTT TTCCGATGGC 4200 

AAATCAACTC GCGAAgTTTC TACTTTTGGT GCTTGCAcAG TTTTCGGTGC TTCTTCTGTT 4260 

GTTACTTGTG TTGATTGTGA TGGTTGCTCA GTTGATGTCG CGCTGTATGA TTGTGTTTCA 4320

TCTATTGTAT TAACGTTATT TGTAGTTGTT TGTGTTTCGC TTGCTTTACT TTCAGTAGCT 4380 

GAACTCCCAC TTTCCTCTAC TGTAGTATTG TTTTGTTCCG ATGCTGCAGC TTCTTTTTCT 4440 

TGTCCCATTC CAACAACGAT CATTGTTCCT AAGAATACTG AGGCCGCTCC CAATTTGTGT 4500 

TTTCTTATGC CGTATCTAAG ATTGCTTTTC ACTATAATAT TCTCCCTTAA ATGCAAAATT 4560 

CATTTATTTT TAAAACTCAA TAAATGCAAT TCTATATTGT TCGGTTTTTA AAAGCAATGA 4620 

AAAAAAGCGA GTTAATAAAA AGTTAAGATT GTTGTTAACT TTATGTATAA TGAGTTTTTT 4680 

ATTATTTGAA ACTCACATAT AT ATTG CAT A CAAAGCTCTT GAACACCTTG ATATAACAGG 4740 
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T ACT AAAC C A TACATAATAA TCGCCTGTAC AATGCATCAT TAACAAGTCA CTGAAACGCC 4860 

TTTCATTGTA TTAATAACGT CACTATAATT TTTATATCGT TCGGTTTTTG TTTGATTTTA 4 920 

ATGATTATTT ATACAAAAAC AG CCGTATTT CAAGCCGACA TTTTAAATTT AACTAAATTT 4 980 

GCATCTAGTT AATAATTGCA TTTATCAAAT TTGTCTTATT GATCCAATCT AATTTGTACT 504 0 

CACAAACTAG TTTAAAATTC TAACTTTATC TCTCAGTTCG TTATCAATCA TCAGACATAA 5100

ACCAATGAAG CAATCAGAAA ACACTCTAAT TTTCTATTAG AAATTTGATT TAATATAAAA 5160 

AAACAGGCTT ACTTCATATA ATTTATGAAA TAAACCCGTC AATTTTTGTT TAATTATGCT 5220 

TTGTGATTCT TTTTATTTCT GCGTAATAAT GCTAAACCTA GAATGCTGAA TAATCCGCCG 5280 

AACAACATAC CTTTGTTTGT TGATTCTTCT CCACCTGTTT CAGGTAGTTC AGATTTCTTA 5340 

GATTGTGGTT TTTTAGTTGG TGCCACTGCT TTAACCTTTT CATTGATTTC AATAACAGGT 54 00 

GTTACTACTT TACCTTGTTC CACTGGTTTA GAAGGCTTTT TAGGTTCTTC TTTGGCAGGT 5460 

GGTACTGGTT TACCAGGTTC AGCTGGTACC TCTGGTGTTG GCGGTGTTGG AGTTTCTGGC 5520 

TCACTCGGCA CTTCTGGTGT CGGTGGTGTT GGTGTTTCCG GCTCACTTGG TACTTCTGGT 5580 

GTTGGTGGCG TTGGTGTTTC CGGCTCACTT GGTACTTCTG GTGTCGGTGG CGTTGGTGGC 5640 

ACGATTGGAG GTGTTGTATC TTCTTCAATC GTTTGTTGAC CTTCATTTTG GCCGCTTACT 5700 

TTTGGAAGTG TATCTTCTTC AAAGTCAACA CTATTGTGTC CACCGAATTG ATAACTTGGT 5760 

TTATCTTTAT TTGTATCTTC TTCAATAATT TCAGTGTGCT TATTGAATCC GTGAATATGT 5820 

GGCACACTGT CGAAGTCGAT AT CAATG ATG TTACCGCCAT GTTCATACTT AGGTTTGTCT 5880 

TTTTCTGTAT CTTCCTCGAA TGACTGATTA CCTTTATTTT GACCATGAAT TTGAGGTACA 5940 

CTATCAAAAT CGaTATCTAC GATATTGCCA CCTTGTTCAT ATTTAGGTTT GTCTTCTTCT 6000 

GTGTCTTCCT CGAATGACTG GTTACCGCTA TTTTGGCCAC CTTCATAACC TAATTCACTC 6060 

TTAATATCAA CGTGGCTATT TTCTTCGATT TCTTCAATCA CGTCATAATT CCCGTGACCA 6120

TTTTCAGTTC CTAAACCAGA ATGAGAAATA TGATGATTGT TTTTAGTAAT TTCCTCGACT 6180 

GGTCCTTGTG CTTGACCATG CTCTTCAGGT AATTCATCCA CTAATTCAAT CAGATTACTT 6240 

tCAGTTGTAT ATTCTTTCGT ATCTTCAACT GTTGTATGAT CGCTCACtGC GCCAGTTACA 6300 

ATACCTTTTG TAGACTCTTC GTCAAATTCA ACTAAGTTAG ACTCAGTAGT AACCTGACCA 6360 

CCACCTGGGT TTGTATCTTC TTCATATTCA ACAACATCAG CGTGATGTTT TGAATTTTCA 6420 

TGTGTAGATT CTTCAAAGTC AATTGGATTT GATTCCTCAG AGGACTCAGT GTATCCTCCA 6480 

ACGTGACCTG ctTCGCTATC CACAGCAGTA TGGTAATCGA TATCAATAGC TGATGAATCC 6540
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TGCTAATCAA TGTCAAGAGT TGATGAATCA TATTCCTCTT CAACAGTAGT TACTAAATTC 6660

TTATCATATT GACCTGTAAG AGTTTCTTTA ATTGTATCTT CTTTATATTC AAATTTATTA 6720

TTTTGAATAA TCGGACCATT TTTCTCATTT CCGTTCGCTT TATTACTGTA TAAAACTAAA 6780

CCATTATCCC AAGTTAAGGT ATATCCTCTA TCATAATAAT ACTTATAAAG TTGCTCTGGA 6840

TGTCCTACCA TTTGTGTTCT AAAATCAACT TCATCAGTAC CATTTAAATA CTCTCCATCA 6900

TAGTGAACAA CATAAGTTTT ATCTAGATTT TCTATATTCA ATGAATAGCT TCCATTATTT 6960

TGTAAATTCA AATTCCCACT CATATTACTT GTGACTTCTT TAAATTTAGA AGTATCTGTC 7020

GTATTTGCAT ATACACTCTT CGCTATGTCT TCATTATTAC CCAAGTATTC AAATATCCTA 7080

ACTTTTGGTT GATTTCCATT CTGATTACTA CCTTTCATTA AAGTTCCAGT AACAGTCACA 7140

CTTGTCGTTT TACCATTATT AGGTTTAATA AATGCAACAT GCGAAAATCT ATTATTCGCT 7200

TTATTAAATG TCTCAATCGA TCCATTTAAA TTGGCATAAT AATTCCCAAT ACCATCTTTA 7260

TATTTAACAT CTAATTCCTT TGAAGTTTGT TCTTCATTTA GTGTTGAAGT TATAGTTTGA 7320

TTTCCATTAG TTTGTACAGT TTTAGGATCA ATAAATAAAT TAATTTCTAG TTCAGCCGTT 7380

ACATCAACCT TATCTTCAAT ATCATTTGTA AATGTATATC TAATCTTTCC ACCTTCTAAA 7440

ACTTCACCTG TCGCCATTAC GACTGAACCA TTTTTAATTT CTGGTACTTT TCTAGCAGTT 7500

GATACGCCAT GCGTATTTAC ATTATTTGAT AAAGTAAAGT CAAAGTAGTC ACCTTGATGT 7560

AAACCATTCT CAAATTTCAA CTTATATTTT AGTACCGCTC GTTGTCCTGC ATGAGGTTCT 7620

ACTTTATTTG TATTGTTATG CCCCTCAATA GAACCAATTT CTACTGTAAC TTTACTTGTT 7680

ACATCTGTAC CCGTTTCCAC TTTCGCGTTA CTAGCTTCCT TAGCTTCCGC TACATCTGCT 7740

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GATCTTGTCA CACGTGGCTT ACTTTCTGAT GCCGTTCTTG GCTGTGCCAC TTCAACTTGT 7800

GTTTCTGCGA CTTGATTTTG TGTAGCCTTT TTAGGTGTTA AATCTACTTG TCTTTGATCT 7860

CCGCTATTGT CTTGAGATTG TGTTGTTTCC TTAACTTGAG GTTTCGCTTC TTCCTTAACT 7920

ACCTCTTCTT TAACTGTTTC TATATTTGCT GGTTGTGCAG TTTGTGGTGC TTGTACTGCT 7980

TTTGGTGCTT CTTCAGTTGT TACTTGTGTT GCGTTTGACG GTTGTTCTGT TACTGTTGCG 8040

TTATATGATT GAGTTTCTTC TATATGATTA ACGTTAGTTG CAGTTGTTTG TGTTTCACTT 8100

GTTTTATTAT CAGTAGCTGA ATTCCCATTT TCTTCTACTG TAGTTGTCTT TTGTTCTGAT 8160

GCTGCAGCTT CTTTGTCTTG TCCCATCCCA ACAACGATCA TTGTTCCTAA GAATACTGAT 8220

GCTGCTCCCA ATTTATGTTT TCTAATGCCG TACCTAAGAT TGTTTTTCAC TATAATATCT 8280

CCCTTTAAAT GCAAAATTCA TTAATTTTTT AAACTTAATA AATGCAAGTC TATATTGTTC 8340



ATGTTAATTG ATAATTTTAT TATTTGAAAT ATACCTATAA ATTGTATTCA AGTCATCAGA 8460

AACCCTTGTC ACACAAGGCT TGTATTTTTT ATACTTATTT TTTAAATTAA ATTCATCATT 8520

ATCTAATTTA AAACAATATA CTAAACGTTT CATAATTATC GCCTGTACAA TACGCACAAA 8580

AACATGTCTT GAAACGCCTT TCATTACTCT AAAATACCCA ATATACTTTT TATATCGTTC 8640

GGATTCTGAG TATTTCAGAC GATTTTCTGC ATAAAAATAA ACGTGTTTCA AGGCAATATA 8700

TTGCAATTAC CTAAAAACAC GTTTACTTAA TATTTAGTTA AACAAATAAG CTAATGAATA 8760

AAATGAAGAT GATACCTGAA ACGGAAATAA TCGTTTCTAA TAATGACCAT GTTAAGAATG 8820

TTTCTTTTAC AGTTAAACCA AAATATTCTT TAAACATCCA AAATCCTGCG TCATTTACAT 8880

GAGACAAAAT CACACTACCT GCACCTATCG CAAGTACAAC TAATGCAACA TTTACATCTG 8940

ATGATTGTAA TAATGGTAAG ACAATACCTG TAGTTGAAAT CGCAGCTACT GTAGCCGAAC 9000 

CTAATGCGAT ACGTAGCACA GCTGCAACAA TCCATGCTAG TAAAATCGGA GACATCTCTG 9060 

TACCTTCAAA CATTTTAGCA ATTGTATTTC CGACACCGCC GTCAATTAAT ACTTGTTTAA 9120 

ATGTACCGCC ACCGCCAATA ATCAATAACA TCATTCCGAT TGGATAAATC GCATTCGTCA 9180 

CTGATTCCAT AATATGATTC ATCTTACGCT TTCTCATTAA TCCCATCGTA ACGATTGCAA 9240

ATAATACTGC TATTAGCATG GCTGTCCCTG CTGTTCCTAT CATATAAATG ATAGATTCAA 9300

ATAGATTTGT AGGTTTGTCA TGCCCAGTTA CAAGTTGCGT TATCGTAGAC ACTAACATTA 9360

ATATGACTGG TAATGTTGCT GTTAATAAAC TCATACCAAA TCCTGGCATC TCTTGATCCG 9420

TAAATTCTTT TTGTGCACCT AACGCTGAAA TATCGCCTTC TCGTGTATAC GCAGACGGAA 9480

TCATTTTTTG TGCAcTTTGT TAAATATAGG CCCTGCAATG AGTGTAACTG GaATGGCAAT 9540

AATCATACCA TACAGTAATA CATCTCCAAC ATTTGCCTTT AATTCTTTTG CGATGACTAC 9600 

CGGTCCTGGA TGTGGTGGTA AAAAGCCATG TGTCACTGAT AAAGCTGTTA CCATAGGTAG 9660

TCCTAGTTTT AACACTGAAA CATTTGCGCG TTTTGCTACT GTAAATACTA ATGGAATCAG 9720

TAAGACTAAA CCTACTTCAA AGAACAATGC AATACCGACG ATAAATGCTG CAACAAGCAT 9780

TGCCCATTGT ACATGTTTTT GACCAAATTT TTGAATCAAC GTGTCTGCGA TTCGAGTTGC 9840 

ACCACCACCA TCAGCAAGCA ATTTCCCAAG TATGGCACCT AAACCGAATA TCAGTGCAAT 9900 

GTGGCCGAGC GTACTGCCCA TTCCTTTCTC AATCGTCTCC ATAATTTTAG TCAATGGTAT 9960 

ACCTAGCATT AACGCTGTAA TCATCGATGT GATAATTAAT GAAATAAATG TATTTAATTT 10020 

AAACCCAATA ATTAATACTA ATAAAATAAC GATACCTAAA ACAACACTGA TTAACGGCCA 10080 

TATTTCGTTA AACATGACAT TCCCCTCTTT CTCTTTTCAA TAGAATGTAA CACCGTCGTC 10140 
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GAGTGACGTA TTTATTGTGT TTTATTTTCA GCGATATGTT GGCGTTGAAA ATCTGCAATT 10260

TGTTCATAAT TCTCTGTTAA AGAACGACTT AAATTGATAA AAATGGATAC GATCTCTTGG 10320

TAAACAGTGA CATTTTCTTC AATCGGCGTA TGATTGTTTG TGGCACCGAC CATCGATGAA 10380 

ACGATTGAAA AATCTTCAAT GTCACCTACA GCTTTAAGTC CGAGCACGCA GGCACCTAAG 10440 

CATGAACTTT CATAACTTTC AGGAACCACT AACTCTGTGT CAAATATATC TGACATCATT 10500 

TGACGCCATA CTTCACTTTT CGCAAAACCA CCTGTTGCTT TTATCATCTT AGGTGTTTCA 10560 

TTCATTACTT CAATAAGCGC AAGATAGACG GTATACAAAT TGTAAAGAAC ACCTTCTAAT 10620 

GCAGCGCGAA TCATATGTTC TTTTTTATGA GATAAAGTTA AACCGAAGAA TGAACCTCTT 10680

GCATTTGCGT TCCAAAGCGG CGCACGTTCT CCTGCTAAAT AGGGATGGAA TATTAAACCA 10740 

TCTGCACCTG GTTTAACACG CTTTGCAATT TGAGTTAAGA CATCATAAGG ATCAACACCG 10800 

AGACGTTTCG CAGTTTCGAC TTCACTCGCT AGCAACTCGT CGCGCAACCA TCTCAATACG 10860

ACACCACCAT TATTTACAGG ACCTCCGATG ACGTAGTGGT CCTCTGTTAA GACATAACAA 10920 

AATATTCTAC CTTTGTAATC AGTACGCGGT TTATCTATCA CAGTACGAAT CGCCCCAGAT 10980 

GTACCGATTG TGACAGCAAC TTCTCCTTTA CCAACACTAT TGACACCTAA ATTAGAAAGG 11040 

ACCCCATCAC TCGCACCAAT AACAAACGGT GTATCTTTAT TAAGCCCCAT TAATGTTGCA 11100 

TAACGTTCTT TCATACCTTT CAtCACATAC GTTGTTGGAA CTAATTCCGG CAACATTTCC 11160 

TTGGAAATAC CCAGCAGTTC TAATGCCTCA ACATCCCAAT CTAATGTTTC TAAATTAAAC 11220

ATCCCTGTTG CGGAAGCCAT TGAATAATCA ATGATATATG TATCAAATAA ATGATAGAAA 11280

ATGTATGTTT TAATATCTGC AAACTTAGCA GTACGTTGAA ATACATCTTG CCATTCATGT 11340

TTCATCCAAA AAATCTTCGC TAATGGCGAC ATAGGATGAA TCGGTGTGCC TGTTCGCTGG 11400

TAAATCGCAT TGCCATCATG CACTTCATTT ATTACTGTTG CATATTTTGC AGCGCGGTTA 11460

TCTGCCCAAG TAATATTATT TGTTAATCTT TGATGTTGCT GATCCATCGC AATCAAGCTA 11520

TGCATTTGCG CACTAAATGA CACAAACTTA ATGTCGTCTT TATTAACTTT GGATTCTCTC 11580

ATAACATATT TAATAGTCAT TAGTACTGCA TCAAATAATT CATCTGGGTT TTCTTCTGAG 11640

ACATCAACGT TTGGTGTGTG TAAATCATAG CCTATTTGAT GTTTCATGAT AAAAGTTCCA 11700

TTTTCATCAT ATAAGACTGA CTTGGTACTC GTCGTTCCAA TGTCGACACC AATCATATAT 11760

TTCATGATAA ATCCTTCTTT CTTTCATTTT AATTCAACCA AAATCCTTCA ATATCTTTAC 11820 

CAACATCGTC GAAATTTAAA TGAAACGCTT CTTTCAAAAT TTGACTGTCG TATTGTTCCA 11880

CTGCATCAAT AAACACTTGA TGATTATGAT GTATGCGTTC AAAATCTTGC GGGTTCTGTT 11940 
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AAAATGAGTT TAAATATTGA TGATTAGATG CTTTGATTAA TGTTTCATGA AATTCAAAGT 12060 

CATGCTTCGT AAATGATTCT GCATCCTCAA ATTTTACTGC CACTTTCATC ATTTCAAGTT 12120

GTTTCTTCAT TTCTTTTACG ATAGGTAGTC GCTCTTGATT TTTAACTCTT GAAAATGCAA 12180

ATGACTCTAA CATCAGTCGC AAATCATACA TTTCTTTCTT TTCTTGTTCC CCAAACGGCA 12240

ACACATGTGC ACCCATTCTT TCTAATTGGA TGAGTTGATT TTGTTGCAAT AATTTAAATG 12300

CATCTCGAAT TGGCGAACGA CTCACATTAA ATTGCTTTGC CATTTGATTT TCAGTGAGTA 12360

ACGTACCTTC AGCTATGTGA CCATTCACAA TGCCTAAGCG TAATTCTGCC GCGATACCTT 12420

CTCCAGTTGT CATACCTTCC AACCATTTCT CTGGATATCC ATACATCATC AAAGTCACTC 12480

CTTCATTACA CGACATACTT GTATACAAGT ATGTTAATAT AGTTATTATG AGTTTGCAAG 12540

CGCTTTCTTT ACGAGCACTA AAATAGTGAC CACCCCTTTT CGATTTAAAT TTAAAGGAAA 12600 

TGGTCACTAT CACACGAATG ATTTAATTGT TATGTTGTAT GTGGGATATT TCTAATTGTT 12660 

CTGTACTCAT ATGCGCTTTA GGTACTTCAA TGCAATAATG CGTTTCATGA CAGTTTGGAC 12720 

ATTCGAATCG ACGTGTTGTC GCTGTATGTT TCGCTTTGAT AACTGCCCAC AAAGATGGTG 12780

AGAATATATG CTGGCAGTTA GGACATAAAT AGGCAACCTT TTGTTGGTAA TAAAAAGTAA 12840

CACCAATGCC ATAACCAATC ATAAATGGTA AAGCAATTAA AAACGGCCAT TTATTTTTCA 12900

TCAAAATTGC ACTTATAATG CTAGAATATT GAATTATTCC TATAATACCA GCACTAATCC 12960

AAATGTTACG ACGAATACTT TTCATTTCAG CTGATTTACT CATGACATGC TCTATGTCTT 13020

TTAAGTGTGT GATTGGAGAC GTCGACGCTT CATTTACGTA ATATTGAACA TTTTTAATTT 13080

TGTTTAATAC CGCTTGTTGC TGTTTAACTT GTTGGTTAAT TTCTTGTTGT TTCATAGTTA 13140

GTAAAGTATT GAGCGTCTTC AAAGTACCTT CACCTTTTAG CAACATATCT ATATCGCTTA 13200

ACGG&CAACC TAAATCTTTA AGCAATAAGA TTAACTCTAA TGTTTGTCGC TGTTGTTCTG 13260

TATACACACG ACGCTTTCCT TCTGTAAATC CTTGTGGTTT CAAAATACCT TTGCGATCAT 13320

AATATTGAAT CGTTCGTGTT GTCACATTGC ATAATTTTGC GAGTTCTCCA GTCGAATAGT 13380

TAGACATAGA TTCCACCTCC TATAATTACC ATAGTTGATG ACCCGACGTC ACGAGCAAGT 13440 

ACAATTTCCA CATTTTAAAG AAATTTATTA TACTAGGCGT CTTATTTTTA TGATTTCGTA 13500 

CCATGTTGAT TTACAAACTC ACTCAAACTA AGTAACACAC CTACTAAACA TCTACTCTGT 13560 

TATTTCAGAA TGAATTTGTT GTAATTTATC TTCAACTTCA GTAATCTCTG TCGCACATTC 13620 

TTTCAGTAAA TCTCGATACT TTTCCGTCTC TGCATTGTTT TTATAACGTA TTTTATGTTC 13680 

TAAACTTGcC CACATATCCA TACCTATCGT TCTAATTTGA ATTTCAACAG GCAATACCTC 13740
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(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1059 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

GGATAAGTTC AGGTAAATTC ATTTCTTTTT CAATTTTGAT TTTCATTGTT TCCGCCCTTT 60 

TAAAATAAAG TTAGTTGCTT CTGTTCCTCA TATTCCAAAT CACTTTGCTT TATATATGTT 120 

TCAAGCTCTT CCGCTGTATC AAATGTCTTT TTCACACCTT GCCAACCTGG CACGATATGA 180 

CCGTGAAAGT AATAAGTGCC ATTTACTACA TGGATATGTG CCACTCGTTC GTTATCCTGA 240

TACAGATATC TCTTAGATCC AAAGAATTGA TTTAGGTATT CTTTACGCGC GCTATCTGTC 300

ATGGTCATCA CTCCTTTTAA CAATTAGGCA GACCAAACGA CATGCATTCG TCGTATAGCT 360

CTTCATTACT TATGCTTGCC TTATAGTTTT CAATCACATT GCTAACTTCT TTATGACTCA 420

TTGCTTTAAC TTGTTCGTCT GTATATTTTT CGCAGTCTTC TAATTCCAGT TGCTCCTGTA 480

ATGACATCAC ATATTCAACT TGTCTTTGGG TTGCCATCGT TAACCCTCCC ACAAGTCAAA 540

AGCTCTTTGG ACGTAAAACT TCGCCTTTGC TAAATCCTCA TGACCATTCT TTAACGGTGC 600

TCTAGACATG TATTTGATTG CATTACCTAT TGCGAATGCT AGTTGAGGTG GATACTGTGC 660

CGTAACCTGT TCGATAAAAT CTATAATTTC AATGTCGCCG TATGTGTAGT GCGCTGGTTG 720

CTTAACATTG TCTTGCGCTT CGTTCATATC TACTTTTCTG TTACTGATTA CGCTCATTAT 780 






GCTTCACTCC ATTTCTTGAA CATTTGGTTA TAAGTGACAT CGAACCAGTA CGGATCACGT 840

GAATSTTTTT GTGGCGTTCC ATCATAAAGC CATGGTCTTA ATCTTCTCTT TCTTTCCTGT 900

TCATATTCCG CTCTCACATT TCGTTGGTAT CGGTTCAAAA TCGCTTTTTT TCTGATTTTT 960

TCTCTCCCTT TTTCTTCATC TTTnATtTGA CTCTnCATAT ATTCAACTTC TTCTGTAGAT 1020

nTTGAGTCCT TTCTTCCACA CAATAATTCA nCGCCGCGC 1059

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30246 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

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GAAGTAAAAG AAGAATTAAA TTTAACATTA ACAATGGATG AAATTGAATA TGTCGGGACA 60

ATTGTAGGTC CTGCATATCC ACAACAGGAT ATGTTAACTG AGTTAAATGG ATTTCGCGCA 120

TTAACCAAAA TCGATTGGGA AAACGTAACT ATCAATAATG AAATTACGGA TATACGCTGG 180

ATTGATAAAG ATAATGATGC GTTGATTGCG CCTGCTGTCA AAGTTTGGAT TGAAACTTAT 240

GGTGGTAAAC ATGACAAATA ATGACACCAT CATGTTACGA CATTATGTCC CACAAGATTA 300

TTCGATGTTA GAAGCTTTTC AATTAAGTGA AAGTGATTTG AAGTTTGTTA AAACGCCAGA 360

GGAAAATATT ACAGCTGCAA TGTCTGATAA TGAAAGGTAT CCCATCGTTG TAATGGATGG 420

CAGGCAATGT GTGGCCTTTT TTACATTACA TCGTGGAAAA GGGGTCGCAC CATTTAGCGA 480

TAACCAAGAT GCAGTATTTT TCAGGTCATT TAGTGTTGAT CAACGTTATC GTAATAGAGG 540

AATAGGTAAA GTGGTAATGG AAAAATTGGC GTCATTTATC ACTTCAACAT TTCAGGATAT 600

TAATGAGATT GTGTTAACGG TTAATACTGA CAATCCACAT GCCATGGCAC TTTATCGCCA 660

ACAAGGATAT CAATATATGG GAGATAGTAT GTTCGTCGGA AGACCTGTTC ATATTATGGC 720

GTTAACTATA AAATAAATTA AATTTAAAAG CATCTTTACT CATCGTCGAC CACAACAATT 780

AATGATGAAT AAAGGTGCTT TTTGTTATAG ATCATCGGAC AATTTACTAT AGTAAAAAGC 840

GACCTAGTGA ACAATTGACA TATATCCACA GGTCGCTTAA CTTAAGTTAT ATTGCTAGTT 900

GCGATTAATT GATAGACTCA TCATTTTTGC GCTGTCGAGA TGGTCTTTTT ATTAAAAATG 960

CCGTAATCCA AGCCGTAATC GGAATACTGA TTGCAACGGC AATACCGCCT AAAATAATAG 1020

AAATAAATTC TTGGGCAAAT ATTTTCGAGT TTATAATATG ACCAAATGAA TATTTAAGTT 1080

TGAAAAACCA AATAAATAAA GCAAGTTGGC CACCAAAAAA GGCAAGGTAA ATCGTGTTCG 1140

CAGATGTCGC TAAAATTTCT CTACCAACAC GCATGCCAGA TTGGAATAAT TCGTATTGCG 1200

TAACGTTgGA TTCACTTGAT GCAATTGATA AATGGGTGAA CTAATGGTAA TTGTTAAATC 1260

TATCACAGCT GCAATAACAG CAAGAATAAT AGTGAACACC ATAAATTGAA CCATATCAAT 1320

GCCAATATTC ATTGAATACA CATATGTTTC ATCTTGTTGT TCGGTTGaAA AGCCTTGTAG 1380

ATGACCGAAG TAGACCGATA AATAAATGAG TGTAATCAAC AATATTGTTG TAACGATAgT 1440

GCtGgATAAA TGCaGCTTGT GTTTTAACAT TGTAACTATT GAGTACGAAT AAATTACAAG 1500

CGCCAATAAT AATGCAGAAA AAGAATGTGA CGACATAAAT CGGTAOGCCA AAAATAATCA 1560

ATACAATACT AATAATTAAA ATAGCGAAAT TTAAAAATAG GGTTAAATAA GAGATGAATC 1620

CCTTTTTACC TCCGAAAATT ATCATCAGAA AGAGGAGCAA TAACGCCAAT ATAAATACAG 1680

CATTCATTGT TTCGCCCTCC TTAATGTTTC AAATATTTCC ATAAACAATA TTGTGATAGG 1740

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CATCGAAATA GTATAAGTCA CTGTATTGGC ATTTTTTAAA AAGATTAAAA ACATAGGTAG 1860

TGCACCGGAT AAATATGAGA ATAATAAGAT GTTAGTCATT GTTCCCATAA TATCTTGGCC 1920

GATGTTTCGC CCAGCAAGCG CCCATCTCCT CATTGAAATG TGTGGCGTAC GCTGTAAAAT 1980

TTCATGCATA CCACTAGCAA TTGTAATTGC AACATCCATA ATAGCGCCAA GTGAACCTAT 2040

TAACACTGAG GCTAGGAAGA TATCTTTCGG TGGTAATGAT AAAAAGTTCA TCGTTTCATA 2100

TTTAATGCCT TTACCATCTG TCATATATAT GATTAATTCT GTTAAACCTA TACTCAAAAA 2160

AGTTCCGATA ATTGTACTGG CTATGGTAAT GAGTGTACGC ATATGCCAGC CTGTAACGAG 2220

CAATAAAGTG AGTATTGTTG AACAGATCAT GGCAATGGTC ATGAGTAAGA ATAAATTAAT 2280

ATTGCTATGT TGAATATGAA TGTAAATTGC GATTAATATG GCAATAGAAT TCAAGATTAA 2340

CGATAAAATC GATTGCAGTC CGACTTTGCG ACCAACCAAT AATACAGTTA ATAAGAACAA 2400

ACCAGTGATG ATAACCGTTA AGGTATCACG CTTCTTTTCT ATAATATAAG CATCACTCGG 2460

CTTGTTAGAA ATATGTAATA ATACTTTTTC GTGTGTGCGA AATGCCTCAG AATCTGCTTG 2520

CGATTTGACG TACTGATGAT TAATCGTCGT CGTTTCTCCA GCAAATTGAC CATTTAATAT 2580

TTTGACTTTT AATTGATTTT TATATTTAAT ATCACGATTA TTTTGTGCAT CTTTTGTAGG 2640

TGTCGAAGAA ACATGTTTGA CATCTATAAT TTGACCAATT GGTTTGTTGT AAAAGTTCTC 2700

ATTATTGAAT GTAAATAAAA TAGCACCAAT GAATGCGATG CAGAACAAAC CTAAAATTAT 2760

ATTAAATGGC TTTGTAAATA AATTTCTATA TTTCAAAAAC AAAACCCCAA TTCTATGAAT 2820

GAATTAATAT GGTGATTATA CGCCCTTAAT TTTTTATTTT CAAAGATATT ACTGCTAAGT 2880

GTAAAACGAA AATCATCATT GATAGCATCG AATTACTTAA TGGAATGTAG ACGTTTTAGT 2940

GATTAATTGe TGAATAAGTG TTAATAATAT GCCAATATCA CTCTTTGTAT AAGGCTCCTT 3000

TGT^ATAGCA CATATCGTTC TTTTTAATTC AGTATGATCT AATTTTATAT CTATCCATGA 3060

TTTAGATTCT GGTAAATGTA TATTTTGTGA TGAAATGATG TAACCTTCTT TTTGACGAAG 3120

GAGATAcTGC GCAAGTGGTT GGCTACTGAT TGTGTATACA TCTGATTTAG TAATCTTGCG 3180

CAATTGTTTT TTTACAGTTT CGGCAAATGG TGCCAAGCAA TAAATATGAC TATGCTCAAA 3240

CTGAATTAAT GGTGGGTGTG TCGCCATCGT AATTGGATCG TCTGAAGGCG CATATAAATG 3300

ATAGTGCTCT TCGAATAAAG GTAGCATATG TAATTGTTTG TGTTTACGTA TTTCTGGTGT 3360

AAGTTCCGTG AAACCAATGT CTATATTCCC ATTTAATACG CTATTTATAA TTGTGTCATG 3420

TTCTAATAAG CTCGGTATGA CATGTGTATC ATTTTGTAAA TGAAACGTTT GGATAAGTGG 3480

TAGTAACATG TGGGATACGT CACTCTCATC ATAGCCAATG TAGATACTTT TATTTTTAGT 3540

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TTCATTAAAT AATAATTTCC CTTCAGATGT GAGCGTAATA TTGCGTCCTT GCTTTTTAAA 3660 

TAAAGACACA TTAAGTTCTT GTTCTAATAA TGTAATTTGA CGGCTTATCG CTGATTGAGC 3720 

AATGTTTAGT TCAAGTGCTG TTTCGGAGAT ATGTTCTCTT TTAGCGACCT CGATAAAATA 3780 

TCTTAATTGT TTAATTTCCA TAGCGATATA GGCACCTCCA AAAATGAGTG TTTTGTAACT 3840

ATTATAGCAA TATTATTGAT AAATGTTCTA TTTTTTAGAT GAATATCTTC TATTTTATAT 3900

ATTGAACAGA TAAATTTTTT AGATTATAGT AATTATCATT AATAACTAAT ATCAGAATAT 3960

TCTAAAAAAG GGGTGTGCAT CATGCACAAT GAGAAATTAA TTAAAGGCTT ATATGACTAT 4020 

CGTGAGGAAC ATGATGCGTG TGGTATTGGT TTTTATGCGA ATATGGATAA TAAAAGGTCT 4080

CACGACATCA TTGATAAATC GCTTGAAATG TTGCGACGCT TAGATCACAG GGGCGGGGTC 4140

GGCGCAGATG GCATCACTGG TGATGGCGCA GGTATTATGA CTGAAATACC TTTTGCATTT 4200

TTCAAACAAC ATGTAACGGA CTTTGATATC CCAGGTGAAG GTGAATATGC CGTGGGGTTA 4260

TTTTTTTCCA AAGAACGCAT TTTAGGTTCT GAACATGAAG TAGTTTTTAA AAAATATTTT 4320

GAAGGCGAAG GGTTATCAAT TCTTGGTTAT CGTAATGTAC CAGTTAATAA AGATGCCATT 4380

GCTAAACATG TAGCAGATAC GATGCCAGTC ATTCAACAAG TGTTTATTGA TATTAGGGAC 4440

ATTGAAGATG TTGAAAAGCG TTTGTTTTTA GCGAGAAAAC AATTAGAGTT CTATTCGACT 4500 

CAGTGCGATT TAGAATTGTA TTTTACGAGC TTATCACGCA AAACAATTGT ATATAAAGGT 4560 

TGGTTACGAT CAGACCAAAT TAAAAAACTA TATACAGATT TATCGGATGA TTTATATCAA 4620 

TCAAAGCTAG GGTTAGTGCA TTCGAGATTT AGTACGAATA CATTCCCGAG TTGGAAAAGG 4680

GCACATCCTA ACCGTATGTT AATGCATAAT GGTGAGATTA ACACGATTAA AGGTAATGTA 4740 

AACTGGATGC GAGCACGCCA ACATAAATTA ATCGAAACAT TATTTGGCGA GGATCAACAT 4800 

AAAGTGTTTC AAATTGTCGA TGAGGATGGT AGTGACTCTG CCATTGTAGA TAATGCGCTA 4860

GAGTTCTTAT CGTTAGCCAT GGAGCCAGAA AAGGCAGCGA TGTTACTCAT ACCTGAACCT 4920

TGGTTATATA ATGAAGCGAA TGATGCAAAT GTACGTGCGT TTTATGAATT TTATAGTTAT 4980

TTAATGGAAC CGTGGGATGG TCCTACAATG ATTTCGTTCT GTAACGGTGA CAAACTTGGC 5040 

GCGCTTACAG ATAGAAATGG ATTACGTCCA GGTCGTTATA CGATTACTAA AGATAACTTT 5100 

ATTGTCTTTT CATCTGAAGT GGGTGTTGTG GACGTACCTG AAAGTAATGT TGCTTTTAAA 5160 

GGTCAATTGA ATCCTGGAAA GTTATTGCTT GTTGATTTTA AACAGAATAA AGTCATTGAA 5220 

AATAATGATT TAAAAGGTGC GATTGCTGGA GAATTACCAT ATAAAGCGTG GATTGATAAC 5280 

CATAAAGTTG ACTTTGATTT TGAAAATATA CAATATCAAG ATTCGCAATG GAAAGATGAG 5340 
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CAGGAACTTG TAGAAGGTAA GAAGGATCCT ATCGGTGCAA TGGGATATGA TGCGCCAATT 5460

GCAGTGTTGA ACGAGCGACC AGAATCACTA TTTAATTACT TTAAACAGCT GTTTGCACAA 5520

GTTACGAATC CACCAATTGA TGCGTATCGT GAAAAAATCG TAACGAGTGA ACTTTCTTAT 5580

TTAGGTGGCG AAGGTAACTT ACTAGCACCT GACGAAACGG TTTTAGATCG TATTCAATTG 5640

AAAAGGCCGG TATTGAATGA ATCACACTTA GCAGCGATTG ATCAGGAACA TTTTAAATTA 5700

ACTTATTTAT CAACGGTATA TGAAGGGGAT TTGGAAGATG CGTTAGAAGC ATTAGGCCGA 5760

GAAGCAGTGA ATGCTGTAAA GCAAGGCGCT CAAATTCTAG TGTTAGATGA TAGTGGATTA 5820

GTTGATAGCA ATGGCTTTGC AATGCCGATG TTACTCGCAA TAAGTCATGT GCATCAATTA 5880

CTTATTAAAG CAGATTTACG TATGTCTACA AGTTTAGTCG CTAAATCTGG TGAGACACGA 5940

GAAGTGCATC ATGTTGCTTG TTTACTCGCA TATGGCGCGA ATGCAATTGT GCCATACCTA 6000

GCGCAACGTA CAGTTGAACA ACTGACATTG ACAGAAGGGT TACAAGGCAC CGTTGTCGAT 6060

AATGTTAAGA CATATACGGA TGTATTGTCA GAAGGTGTCA TTAAAGTAAT GGCTAAGATG 6120

GGAATTTCGA CAGTGCAAAG TTATCAAGGG GCACAAATAT TTGAAGCGAT TGGCTTGTCT 6180

CATGATGTGA TTGATCGTTA TTTTACTGGG ACACAGTCTA AGTTATCTGG TATTTCGATT 6240

GATCAAATTG ATGCTGAAAA TAAAGCACGT CAACAAAGTG ATGATAATTA TCTTGCATCA 6300

GGTAGTACAT TCCAATGGAG ACAACAAGGT CAACATCATG CTTTTAATCC GGAATCTATT 6360

TTCTTATTGC AGCACGCATG TAAAGAAAAT GACTATGCGC AATTTAAAGC ATACTCTGAA 6420

GCGGTGAACA AAAATAGAAC AGATCACATT AGACATTTAC TTGAATTTAA AGCATGTACA 6480

CCGATTGACA TCGACCAAGT TGAACCGGTA AGTGACATTG TCAAACGCTT TAATACAGGG 6540

GCGATGAGTT ATGGATCGAT TTCAGCGGAA GCACATGAAA CGTTAGCACA AGCCATGAAC 6600

CAATTAGGTG GAAAGAGTAA TAGTGGTGAA GGTGGCGAAG ATGCAAAACG TTATGAAGTA 6660

CAAGTTGATG GAAGCAACAA AGTAAGTGCG ATTAAACAAG TTGCTTCTGG GCGTTTTGGT 6720

GTAACTAGTG ATTATTTACA ACATGCCAAA GAAATTCAAA TTAAAGTTGC GCAAGGTGCA 6780

AAGCCTGGTG AAGGTGGTCA ATTACCTGGT ACTAAGGTAT ATCCGTGGAT TGCGAAGACA 6840

AGAGGGTCAA CGCCAGGTAT CGGTCTGATT TCACCACCGC CACATCATGA TATTTATTCA 6900

ATAGAAGATT TAGCGCAACT GATACATGAT TTGAAAAATG CGAATAAAGA TGCAGATATC 6960

GCGGTAAAAT TAGTTTCGAA AACAGGTGTT GGTACCATTG CATCTGGGGT GGCAAAAGCA 7020

TTTGCAGATA AAATTGTCAT CAGTGGTTAC GATGGTGGTA CAGGGGCTTC ACCCAAAACG 7080

AGTATTCAGC ATGCCGGTGT TCCTTGGGAG ATTGGTTTAG CAGAAACACA TCAAACATTA 7140

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AAAGATGTAG CGTACGCATG TGCGCTTGGA GCGGAAGAAT TTGGATTTGC AACTGCACCA 7260 

TTAGTGGTGT TGGGCTGTAT TATGATGCGT GTATGCCATA AAGATACATG TCCAGTAGGA 7320 

GTTGCAACTC AAAACAAAGA TTTACGTGCT TTATATAGAG GTAAAGCACA TCATGTTGTT 7380

AATTTTATGC ATTTTATTGC ACAAGAATTA AGAGAAATTT TAGCATCTTT AGGTTTGAAA 7440

CGTGTAGAAG ACTTAGTTGG AAGAACTGAT TTATTACAAC GATCATCAAC ATTAAAAGCG 7500

AATAGCAAAG CGGCTAGTAT TGATGTTGAA AAACTGTTAT GTCCTTTCGA TGGGCCAAAC 7560

ACAAAAGAAA TTCAACAAAA TCATAATCTT GAGCATGGAT TTGATTTAAC AAATTTATAT 7620

GAAGTAACGA AGCCATATAT TGCTGAAGGG CGTCGCTATA CAGGTAGCTT TACAGTAAAT 7680

AATGAACAAC GTGATGTAGG GGTTATTACA GGTAGTGAGA TTTCGAAACA ATATGGAGAA 7740

GCAGGACTTC CTGAAAATAC AATTAATGTT TATACGAATG GTCATGCTGG TCAAAGTCTT 7800

GCAGCATATG CACCGAAAGG CTTAATGATT CATCATACTG GAGATGCGAA TGACTATGTT 7860

GGTAAAGGAT TATCTGGTGG TACGGTCATT GTCAAAGCAC CTTTTGAAGA ACGACAAAAT 7920

GAAATTATTG CTGGTAACGT CTCATTCTAT GGTGCGACAA GTGGTAAGGC ATTTATTAAC 7980

GGTAGTGCAG GAGAAAGATT CTGTATTAGA AATAGTGGTG TAGATGTTGT CGTTGAAGGT 8040

ATCGGCGACC ATGGATTAGA GTATATGACT GGTGGACATG TCATTAATTT AGGTGATGTA 8100

GGTAAGAACT TCGGTCAAGG TATGAGTGGT GGTATTGCTT ACGTTATCCC GTCTGATGTA 8160

GAAGCTTTTG TTGAAAATAA TCAACTAGAT ACGCTTTCGT TTACAAAGAT TAAACACCAA 8220

GAAGAAAAAG CATTCATTAA GGAAATGCTG GAAGAACATG TGTCACACAC GAATAGTACG 8280

AGAGCGATTC ATGTGTTAAA ACATTTTGAT CGCATTGAAG ATGTCGTCGT TAAAGTTATT 8340

CCTAAAGATT ATCAATTAAT GATGCAAAAA ATTCATTTGC ACAAATCATT ACATGACAAT 8400

GAAGATGAAG CGATGTTAGC TGCATTTTAC GATGACAGTA AAACAATCGA TGCTAAACAT 8460

AAACCAGCCG TTGTGTATTA AGGAAAGGGG GAGATACGAT GGGTGAATTT AAAGGATTTA 8520

TGAAGTATGA CAAACAGTAC TTAGGTGAAT TATCACTGGT AGACCGTTTG AAGCATCATA 8580

AAGCATATCA ACAACGATTT ACTAAAGAAG ATGCCTCTAT CCAAGGTGCA CGATGTATGG 8640

ATTGTGGAAC GCCGTTTTGT CAAACCGGAC AACAGTATGG TAGGGAAACA ATAGGTTGTC 8700

CAATTGGAAA CTACATTCCT GAATGGAACG ACTTAGTGTA TCATCAAGAT TTTAAAACTG 8760

CTTATGAACG CTTAAGCGAA ACAAATAACT TTCCTGACTT TACAGGGCGT GTATGTCCTG 8820

CACCATGCGA AAGTGCTTGT GTGATGAAGA TTAATAGAGA ATCGATTGCG ATTAAAGGTA 8880

TTGAACGCAC AATTATTGAT GAAGCTTTTG AAAATGGTTG GGTAGCGCCG AAAGTTCCGA 8940
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CTGAAGAACT TAATCTACTA GGATATCAAG TAACTATTTA TGAACGTGCT AGAGAATCAG 9060 

GCGGTTTATT AATGTATGGT ATTCCGAATA TGAAACTTGA TAAAGATGTG GTTCGACGTC 9120 

GTATTAAGTT AATGGAAGAA GCGGGCATTA CTTTCATTAA TGGTGTTGAA GTCGGTGTTG 9180 

ATATTGATAA AGCAACGTTA GAATCTGAGT ATGATGCCAT TATATTATGT ACTGGTGCAC 9240

AAAAAGGTAG AGATTTACCT TTAGAAGGAC GCATGGGTGA TGGTATACAT TTCGCTATGG 9300

ATTATTTAAC TGAACAAACG CAGTTGTTAA ATGGAGAAAT TGATGATATA ACAATAACTG 9360

CAAAAGATAA GAATGTCATT ATCATTGGTG CTGGTGATAC AGGGGCAGAC TGTGTAGCGA 9420

CAGCATTAAG AGAAAATTGT AAATCGATTG TTCAATTTAA TAAATATACG AAATTGCCAG 9480

AAGCAATTAC ATTTACAGAA AATGCATCAT GGCCTTTAGC AATGCCGGTG TTTAAAATGG 9540

ACTATGCGCA CCAAGAGTAC GAAGCTAAGT TTGGTAAGGA ACCACGTGCA TATGGTGTTC 9600

AAACAATGCG TTACGATGTT GACGATAAAG GACACATACG TGGTTTGTAT ACTCAAATTT 9660

TAGAGCAAGG CGAAAATGGT ATGGTCATGA AAGAAGGACC TGAAAGATTT TGGCCTGCTG 9720

ACCTTGTATT ATTATCAATC GGCTTCGAAG GTACAGAACC AACAGTACCG AATGCTTTTA 9780

ACATTAAAAC GGATAGAAAT CGAATCGTGG CGGATGATAC AAACTATCAA ACTAATAATG 9840 

AAAAGGTATT TGCTGCTGGA GATGCTAGAC GTGGTCAAAG TTTAGTTGTA TGGGCAATTA 9900 

AAGAAGGTAG AGGCGTAGCG AAAGCAGTAG ATCAGTATTT AGCTAGTAAA GTTTGTGTAT 9960 

AATCTTTGTA TGGAAATGGT GGTTACGTTG ACGTTGTGAC ATGCTGAATC GAGTTTGAAA 10020 

AAATCTAGTA TCTATCAACG TCACATGCCA TCTTTGTAAC CTAAAAACAA AGGTTTGTAA 10080

GACAACAAAT AGATTAATTA TAAGTAGTGA TTTTTTACAT TCGTTTATAG GTCAACTGTA 10140 

GTGGAAGACA ATGATTTGTG GTAATCATGT AATGCTTAAA AA 10200

AACGTTCATA TATGATAAAT ATTGTGTTTA GGAGGAATAC CCAAGTCCGG CTGAAGGGAT 10260 

CGGTCTTGAA AACCGACAGG GGCTTAACGG CTCGCGGGGG TTCGAATCCC TCTTCCTCCG 10320 

CCATCAATAT TTATATTAAA TTCTATATAT AATGAAGGTA AGTGCTCAAA TTTTGAGTAT 10380 

TTACCTTTTT TATTTGTCTT TGAATGGCTC GTAATTTTTG ATAATAGAAA TGATAAGGCA 10440 

TTGAGATTGG AAGGGCATTT GGCTTGTGCA ATATACATAG CTAAATGTCT TTTTTGTTTT 10500 

GTGAAATATG ATGGATGGCT TGTGTGGACA AGTTTGCTAT TTATAGATAT GCATTTTTCA 10560 

ATTTAGGAGT TGGCCATGCA TCTACACTTT ATAATGGTGA GAGCGTGGTG AGGTATTGTT 10620 

AATAACGCAA TTGTAGCGAG GAGTTATTGC TACATATGTC GTTATGGCTC ATTGATTTTC 10680 

TGAAATGGCT ACCCCAGATA ATTGTGACAA AATAAAAATA TTTTGTTGAA AGCCTTTACA 10740 
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TAAAAAGAGA AGATGTAAAA GCCATCGTAA CCGCTATTGG GGGAAAAGAA AATCTTGAAG 10860

CTGCAACGCA TTGTGTAACA CGATTACGTT TAGTGCTGAA GGATGAAAGT AAAGTTGATA 10920

AAGACGCATT AAGTAATAAC GCGTTGGTCA AGGGGCAGTT TAAAGCAGAC CATCAATATC 10980

AAATTGTCAT TGGTCCAGGA ACAGTCGATG AAGTGTATAA GCAGTTTATT GATGAAACAG 11040

GTGCTCAAGA AGCTTCGAAA GATGAAGCGA AACAAGCAGC TGCACAAAAA GGGAATCCAG 11100

TACAACGTTT GATCAAATTG TtGGGGGATA TTTTTATACC AATATTACCT GCGATTGTGA 11160

CAGCTGGTTT GTTAATGGGA ATCAATAATT TACTTACAAT GAAAGGTTTA TTTGGTCCAA 11220

AAGCACTTAT TGAGATGTAT CCACAAATTG CTGATATTTC AAACATCATT AATGTGATTG 11280

CGAGTACGGC ATTTATTTTC TTACCAGCAT TAATTGGTTG GAGTAGTATG CGTGTATTTG 11340

GTGGTAGTCC GATTCTAGGC ATAGTCTTAG GTTTGATTTT AATGCATCCG CAATTAGTAT 11400

CTCAGTATGA TTTGGCAAAA GGGAATATTC CGACGTGGAA CTTATTTGGC TTAGAGATTA 11460

AGCAGTTGAA TTACCAAGGT CAAGTGTTGC CAGTtTTAAT TGCAGCTTAC GTTCTAGCTA 11520

AAATTGAAAA AGGATTAAAT AAAGTCGTTC ACGATTCGAT AAAAATGTTG GTCGTTGGAC 11580

CCGTAGCGCT TTTAGTTACT GGATTTTTAG CATTTATTAT CATTGGACCA GTTGCGTTAT 11640

TGaTTGGTAC AGGTATTACA TCTGGTGTTA CATTTATATT CCAACATGCA GGATGGCTTG 11700

GCGGAGCAAT ATATGGATTG TTATATGCAC CACTTGTAAT TACAGGACTA CACCATATGT 11760

TTTTAGCAGT AGATTTCCAA TTGATGGGTA GCAGCTTAGG CGGTACGTAT TTATGGCCAA 11820

TTGTTGCGAT TTCCAATATT TGTCAGGGCT CTGCAGCATT TGGAGCATGG TTTGTCTATA 11880

AACGTCGTAA AATGGTTAAA GAAGAAGGCT TGGCATTAAC ATCTTGTATT TCTGGTATGT 11940

TAGGTGTTAC TGAACCAGCC ATGTTCGGTG TGAACTTACC TCTGAAATAT CCATTTATCG 12000

CTGQGATATC AACGTCTTGT GTATTGGGGG CAATCGTTGG TATGAATAAC GTACTTGGAA 12060

AAGTTGGTGT TGGTGGCGTG CCAGCATTCA TTTCAATTCA AAAAGAATTT TGGCCAGTAT 12120

ATCTTATTGT GACAGCTATT GCTATTGTTG TACCATGTAT ACTAACAATT GTGATGTCTC 12180

ATTTTAGTAA ACAAAAAGCG AAAGAAATTG TTGAAGATTA ATAAAATAAA AAAGGGGCGT 12240

TCGTTATTTG GACGTCCTTT ATTACGTTAT AAGGTGGTAA TTGTGTGTCG AAAGAAATAG 12300

ATTGGAGAAA ATCCGTTGTA TATCAAATTT ATCCTAAGTC GTTTAATGAT ACGACGGGGA 12360

ATGGTATAGG AGATATCAAT GGAATTATAG AAAAATTGGA TTATATCAAG TTATTGGGTG 12420

TTGATTATAT TTGGTTAACA CCAGTGTATG AATCACCGAT GAATGATAAT GGCTATGATA 12480

TCAGCAATTA TTTAGAAATC aATGAAGACT TTGGAACGAT GGATGATTTT GaAAAGTTAA 12540

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CGACGGAGCA TGaATGGTTT AAAGAAGCCC GTAAATCTAA AGATAACCCy TATAGAGATT 12660 

ATTACTTTTT CAGATCATCT GAAGACGGGC CGCCAACAAA TTGGCATTCT AAATTCGGTG 12720 

GTAATGCATG GAAGTATGAT TCTGAGACAG ATGAATATTA TTTACATTTA TTTGATGTCA 12780

GTCAAGCTGA TTTAAATTGG GATAATCCGG AAGTACGTCA ATCGTTATAT CGCATAGTCA 12840

ATCATTGGAT AGACTTCGGC GTTGATGGTT TTCGATTTGA TGTCATTAAC TTAATTTCTA 12900

AAGGTGAATT TAAGGACTCT GACAAAATAG GTAAAGAATT TTATACGGAT GGTCCTAGAG 12960

TGCATGAGTT TCTGCATGAA TTAAATCGTC AAACGTTTGG TAACACTGAC ATGATGACTA 13020

TAGGAGAAAT GTCTTCGACG ACGATTGAAA ATTGTATTAA GTATACACAA CCAGAACGCC 13080

AAGAATTGAA TAGTGTTTTT AATTTTCATC ATCTAAAGGT TGATTATGTT GATGGTGAAA 13140

AGTGGACAAA TGCGAgcTTG nATTTTCATA AGTTAAAGGA AATTCTGATG CAATGGCAAC 13200

GAGGTATTTA TGACGGTGGC GGATGGAACG CGATTTTCTG GTGTAATCAT GATCAGCCAC 13260

GGGTAGTGTC TAGATTTGGT GATGATACGT CGGAAGAGAT GAGGATACAA AGTGCTAAAA 13320

TGTTAGCTAT CGCACTGCAT ATGTTGCAAG GGACGCCATA TATTTACCAA GGTGAAGAAA 13380

TTGGTATGAC GGACCCACAT TTTACATCAA TAGCACAATA TCGTGATGTT GAATCGATTA 13440

ATGCCTACCA TCAGTTGTTA AGTGAAGGGC ATGCTGAAGC GGATGTGTTA GCGATTTTAG 13500

GACAGAAGTC ACGAGACAAT TCGAGAACGC CTATGCAATG GAGTGATGAT GTTAATGCTG 13560

GATTTACAGC TGGTAAnCCT TGGATTGATA TTTCGGAAAA TTATCATCAG GTCAACGTTA 13620

GACAAGCACT TCAGAATAAA GAGTCTATTT TCTATACGTA TCAAAAATTA ATACAATTAA 13680

GACATACGCA TGATATTATT ACGTATGGAG ACATTGTGCC ACGTTTTATG GATCATGATC 13740

ATTTATTTGT TTATGAACGT CATTATAAGA ATGAAGAATG GGTAGTAATT GGGAATTTCT 13800

CAGGATCGGC TGTTGATTTG CCAGAAGGAT TGGCTAGAGA AGGTTGTGTT GTGATTCAAA 13860

CAGGCACAGT GGAAAATAAT ACGATAAGCG GGTTTGGTGC AATTGTAATC GAAACAAACG 13920

CGTAAAATAA ATTGAGTGGA TGCGTTTATA TGGCGAAACA AAAAAAGTTT ATGAAGATTT 13980

ATGAGGCGTT GAAAGAAGAT ATATTAAACG G&CAGATTCA ATATGGTGAA CAAATTCCGT 14040

CTGAACATGA TTTGGTGCAA TTGTACCAGT CATCTCGAGA GACCGTGCGT AAGGCATTAG 14100

ATTTGTTGGC ATTAGACGGC ATGATTCAAA AGATTCATGG TAAAGGGTCA CTTGTCATTT 14160

ATCAGGAGGT TACAGAGTTT CGATTTTCTG AACTTGTTAG TTTTAAAGAA ATGCAAGAAG 14220

AAATGGGCGT CGCATATTTA ACTGAAGTTG TTGTGAATGA GGTTGTTGAA GCGCATGAAG 14280

TTCCAGAAGT TCAACATGCT TTAAACATCA ATTCTAGTGA ATCACTCATT CATATTGTTA 14340 
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TTGTTTCAGA TATAGGTAAT GATGTTGCGA GTGATTCTAT TTATGATTAT TTGGAAAAGG 14460

TATTAAATCT TAATATTAGT TATTCAAGTA AGTCTATTAC TTTTGAACCG TTTGATGAAC 14520

AAGCATATCA ATTGTTTGGT GATGTATCGG TGGCTTATTC AGCAACAGTT CGAAGTATTG 14580

TGTATTTAGA AAATACAATG CCGTTTCAAT ATAATATTTC AAAACATCTT GCAAATGAAT 14640

TTAAATTTAA TGACTTCTCA AGACGTCGTA TAAAGTAAAC AATGATATAA ATGATTTATA 14700

CTTGCAATTA ACTATTAAAA TATAGTAATA TATATCTTGC CGTGCTAGGT GGGGAGGTAG 14760

CGGTTCCCTG TACTCGAAAT CCGCTTTATG CGAGGCTTAA TTCCTTTGTT GAGGCCGTAT 14820

TTTTGCGAAG TCTGCCCAAA GCACGTAGTG TTTGAAGATT TCGGTCCTAT GCAATATGAA 14880

CCCATGAACC ATGTCAGGTC CTGACGGAAG CAGCATTAAG TGGATCATCA TATGTGCCGT 14940

AGGgTAGCCG AGATTTAGCT AACGACTTTG GTTACGTTCG TGAATTACGT TCGATGCTTA 15000

GGTGCACGGT TTTTTATTTT TTAAATATTA AACCGATTAT TAAGAGTTGA AAATATATAA 15060

TTATAGAAGC TACTTTCTTG AAGACAATTC AGCGTATTAT ACGTGGAACA TGTTTGTGGG 15120

AAGTAGCTTT TTTATATGTG AAGTTTGATT CAAGTGAACT CGATGTGCAG TTTGAATGAT 15180

TTTTGTGTCA ATGAAAAGTA AGAAGTTATA ATTTGATGAT AAAGAAATGA TGGTGAAATG 15240

AGGGGGAGTA TCTTACAATA GAATTATTAA TGAGATACGT TATGATTATT GACAATCAAA 15300

TGCCTACGGA GGACATATGC AAATATATTT AAGTACTTTA ACAGAGTTAG ATTATGATAA 15360

ATCTTTAAAT AGTATTGAAG AAAGTTTTGA TGATAATCCT GAAACGAGTT GGCAAGCACG 15420

TGCGAAAGTA AAACATTTAA GAAAATCTCC TTGCTATAAT TTTGAATTAG AAGTAATAGC 15480

GAAAAATGAA AATAACGATG TCGTTGGACA CGTTTTATTA ATTGAAGTAG AAATTAATAG 15540

TGATGATAAG ACGTATTATG GTTTGGCGAT TGCCTCTTTA TCAGTTCATC CTGAATTACG 15600

TGGACAAAAA TTAGGTCGTG GCTTGGTTCA AGCAGTAGAA GAGCGTGCCA AAGCACAAGA 15660

GTATAGTACG GTTGTTGTAG ACCATTGTTT TGACTACTTT GAAAAGTTGG GTTATCAAAA 15720

TGCTGCTGAG CATGACATTA AATTAGAATC TGGTGATGCA CCGTTACTTG TAAAATATTT 15780

ATGGGATAAT TTGACGGATG CACCACACGG AATCGTAAAA TTTCCAGAAC ATTTTTATTA 15840

ATTGTTCAAT TAAGAAGTAA AGGTATTATC ATGCTATAAT GAGAGGTAAT TGTTTATGGA 15900

GGTGCTAACT TGAATTATCA AGCCTTATAT CGTATGTACA GACCCCAAAG TTTOGAGGAT 15960

GTCGTCGGAC AAGAACATGT CACGAAGACA TTGCGCAATG CGATTTCGAA AGAAAAACAG 16020

TCGCATGCTT ATATTTTTAG TGGTCCGAGA GGTACGGGGA AAACGAGTAT TGCCAAAGTG 16080

TTTGcTAAAG CAATCAACTG TCTAAATAGC ACTGATGGAG AACCTTGTAA TGAATGTCAT 16140

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AATAATGGCG TTGATGAAAT AAGAAATATT AGAGACAAAG TTAAATATGC ACCAAGTGAA 16260 

TCGAAATATA AAGTTTATAT TATAGATGAG GTGCACATGC TAACAACAGG TGCTTTTAAT 16320 

GCCCTTTTAA AGACGTTAGA AGAACCTCCA GCACACGCTA TTTTTATATT GGCAACGACA 16380 

GAACCACATA AAATCCCTCC AACAATCATT TCTAGGGCAC AACGTTTTGA TTTTAAAGCA 16440

ATTAGCCTAG ATCAAATTGT TGAACGTTTA AAATTTGTAG CAGATGCACA ACAAATTGAA 16500

TGTGAAGATG AAGCCTTGGC ATTTAtcgCT AAAGCGTCTG AAGGGGGTAT GCGTGATGCA 16560

TTAAGTATTA TGGATCAGGC TATTGCATTT GGTGATGGTA CGTTAACATT GCAAGATGCG 16620

TTGAATGTCA CAGGTAGCGT ACATGATGAA GCGTTGGATC ACTTGTTTGA TGATATTGTA 16680 

CAAGGTGACG TACAAGCATC TTTTAAAAAA TACCATCAGT TTATAACAGA AGGTAAAGAA 16740 

GTGAATCGCC TAATAAATGa TATGATTTAT TTTGTCaGAG ATACGATTAT GAATAAAACA 16800 

TCTGAGAAAG ATACTGAGTA TCGAGCACTG ATGAACTTAG AATTAGATAT GTTATATCAA 16860 

ATGATTGATC TTATTAATGA TACATTAGTG TCGATTCGTT TTAGTGTGAA TCAAAACGTT 16920

CATTTTGAAG TGTTGTTAGT AAAATTAGCT GAGCAGATTA AGGGTCAACC ACAAGTGATT 16980

GCGAATGTAG CTGAACCAGC ACAAATTGCT TCATCGCCAA ACACAGATGT ATTGTTGCAA 17040

CGTATGGAAC AGTTAGAGCA AGAACTAAAA ACACTAAAAG CACAAGGAGT GAGTGTCGCT 17100 

CCTGTTCAAA AATCTTCGAA AAAGCCTGCG AGAGGCATAC AAAAATCTAA AAATGCATTT 17160 

TCAATGCAAC AAATTGCAAA AGTGCTAGAT AAAGCGAATA AGGCAGATAT CAAATTGTTG 17220 

AAAGATCATT GGCAAGAAGT GATTGATCAT GCCAAAAATA ATGATAAAAA ATCACTCGTT 17280

AGTTTATTGC AAAATTCGGA ACCTGTGGCG GCAAGTGAAG ATCACGTACT TGTGAAATTT 17340

GAGGAAGAGA TGCATTGTGA AATGGTGAAT AAAGAGGACG AGAAACGTAG TAGTATAGAA 17400

AGTGTTGTAT GTAATATCGT TAATAAAAAC GTTAAAGTTG TTGGTGTACC ATCAGATCAA 17460 

TGGCAAAGAG TTCGAACGGA ATATTTACAA AATCGTAAAA ACGAAGGCGA TGATATGCCA 17520 

AAGCAACAAG CACAACAAAC AGATATTGCT CAAAAAGCAA AAGATCTTTT CGGTGAAGAA 17580 

ACTGTACATG TGATAGATGA AGAGTGATAC ATGACAAGCG ATATAATCGT ATGTATAATG 17640 

AAAGAAACAT CATTTTATTG ATAAATATTT ATTGATTTTC AAGGAGGAAA TGGAATATGC 17700

GCGGTGGCGG AAACATGCAA CAAATGATGA AACAAATGCA AAAAATGCAA AAGAAAATGG 17760 
CTCAAGAACA AGAAAAACTT AAAGAAGAGC GTATTGTAGG AACAGCTGGC GGTGGCATGG 17820 
TTGCAGTTAC TGTAACTGGT CATAAAGAAG TTGTCGACGT TGAAATCAAA GAAGAAGCTG 17880 
TAGACCCAGA CGATATTGAA ATGCTACAAG ACTTAGTGTT AGCAGCTACT AATGAAGCGA 17940 
TCCCTGGaAT GTGATCATAG ATGCATTATC CAGAACCTAT ATCAAAACTT ATTGATAGCT 18060

TTATGAAATT GCCAGGCATT GGTCCAAAGA CAGCCCAACG TCTGGCTTTT CATACCTTAG 18120

ATATGAAAGA AGACGATGTT GTTCAGTTTG CCAAAGCATT AGTAGATGTT AAGAGAGAAT 18180

TAACATATTG TAGCGTATGT GGTCACATTA CTGAAAATGA TCCATGTTAT ATTTGTGAAG 18240

ATAAGCAAAG AGATCGTTCA GTTATTTGTG TTGTGGAAGA TGACAAAGAT GTCATAGCTA 18300

TGGAAAAAAT GAGAGAATAC AAAGGTTTAT ATCACGTTTT ACATGGGTCT ATTTCGCCTA 18360

TGGATGGCAT TGGACCAGAA GATATTAATA TTCCTTCATT GATTGAACGC TTGAAAAACG 18420

ATGAAGTTAG CGAATTAATC TTAGCTATGA ACCCGAACTT AGAGGGGGAA TCTACAGCCA 18480

TGTATATTTC TAGATTAGTT AAGCCTATAG GTATCAAAGT GACGAGATTA GCACAAGGGT 18540

TATCGGTAGG TGGCGATTTA GAGTATGCTG ACGAAGTAAC ATTATCTAAA GCAATCGCAG 18600

GTAGAACAGA AATGTAATkT CTTCTATTAA ACATTTTTGA TTTTAATACT ATAGTAAGAA 18660

AAGTCACAGT GTAATCATTG TGGCTTTTTT TATGGTGTGG TGTGATGTAC TACTTTATTT 18720

GCGGTGTGGC GGTGGTATGG TTTACCTAGT TTTACTGAGG GATGGGTAAT CTTTAGGAAG 18780

CAAGCCGTTG GTTGTGATTT GTTACTTCTA ATAGTAATGA TGTGAATTGG ATTATCGAAT 18840

TAGATCTATG GTTATGGTGT GTTGGTGCTA TTAATTTGAT AAATGCGGTT AATGACTATG 18900

CAAATGAAAT TCTTTTGTAA TTGAAATGAT AGATGCTGGC TTAGTAAGTT GTACTTCTTT 18960

GGTCTAAAGC TTATTAAATC AGCCTGTATA GCGGTGTTTT GAGAGATTAT TTAAAACTTG 19020

TAAATTTATT TTTAATTTCT GGTAAAAAAA TAACGTTCTG TTTTGCGTTT TTTTTGATTG 19080

ATATGGTTAG AGAAAAATCT GTTTCTTGTT CTAAAAAACG TACTATTTAT AAGTGGGGAT 19140

TTTTTAAGTT CGATTTTTAG GATAAGGGCG TTCAGTACAG ATGACAAAGG TGTAATTTTT 19200

ACTGTTGTTA AGCAGTTTGA AAGCCTGTAT AGTATTTATT TGTTGAGGCA AACAAAACAA 19260

CTCAACTTAA GAAATAACTT GAATTACTAA CGAAAATTAA TTTTAAAAAG TTATTGACTT 19320

AAATGTTAAT AAAATGTATA ATTAATTCTT GTCGGTAAGA AAAATGAACA TTGAAAACTG 19380

AATGACAA^A TGTCAACGTT AATTCCAAAA AACGTAACTA TAAGTTACAA ACATTATTTA 19440

GTATTTATGA GCTAATCAAA CATCATAATT TTTATGGAGA GTTTGATCCT GGCTCAGGAT 19500

GAACGCTGGC GGCGTGCCTA ATACATGCAA GTCGAGCGAA CGGACGAGAA GCTTGCTTCT 19560

CTGATGTTAG CGGCGGACGG GTGAGTAACA CGTGGATAAC CTACCTATAA GACTGGGATA 19620

ACTTCGGGAA ACCGkAGCTA ATACCGGATA ATATTTTGAA CCGCATGGTT CAAAAGTGAA 19680

AGACGGTCTT GCTGTCACTT ATAGATGGAT CCGCGCTGCA TTAGCTAGTT GGTAAGGTAA 19740
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_35_ 



GAGACACGGT CCAGACTCCT ACGGGAGGCA GCAGTAGGGA ATCTTCCGCA ATGGGCGAAA 19860

gCtGaCGGAG CAACGCCGCG TGAGTGATGA AGGTCTTCGG ATCGTAAAAC TCTGTTATTA 19920

GGGAAGAACA TATGTGTAAG TAACTGTGCA CATCTTGACG GTACCTAATC AGAAAGCCAC 19980

GGCTAACTAC GTGCCAGCAG CCGCGGTAAT ACGTAGGTGG CAAGCGTTAT CCGGAATTAT 20040

TGGGCGTAAA GCGCGCGTAG GCGGTTTTTT AAGTCTGATG TGAAAGCCCA CGGCTCAACC 20100

GTGGAGGGTC ATTGGAAACT GGAAAACTTG AGTGCAGAAG AGGAAAGTGG AATTCCATGT 20160

GTAGCGGTGA AATGCGCAGA GATATGGAGG AACACCAGTG GCGAAGGCGA CTTTCTGGTC 20220

TGTAACTGAC GCTGATGTGC GAAAgCGTGG GGATCAAACA GGATTAGATA CCCTGGTAGT 20280

CCACGCCGTA AACGATGAGT GCTAAGTGTT AGGGGGTTTC CGCCCCTTAG TGCTGCAGCT 20340

AACGCATTAA GCACTCCGCC TGGGGAGTAC GACCGCAAGt TGAAACTCAA AGGAATTGAC 20400

GGGGACCCGC ACAAGCGGTG GAGCATGTGG TTTAATTCGA AGCAACGCGA AGAACCTTAC 20460

CAAATCTTGA CATCCTTTGA CAACTCTAGA GATAGAGCCT TCCCCTTCGG GGGACAAAGT 20520

GACAGGTGGT GCATGGTTGT CGTCAGCTCG TGTCGTGAGA TGTTGGGTTA AGTCCCGCAA 20580

CGAGCGCAAC CCTTAAGCTT AGTTGCCATC ATTAAGTTGG GCACTCTAAG TTGACTGCCG 20640

GTGACAAACC GGAGGAAGGT GGGGATGACG TCAAATCATC ATGCCCCTTA TGATTTGGGC 20700

TACACACGTG CTACAATGGA CAATACAAAG GGCAGCGAAA CCGCGAGGTC AAGCAAATCC 20760

CATAAAGTTG TTCTCAGTTC GGATTGTAGT CTGCAACTCG ACTACATGAA GCTGGAATCG 20820

CTAGTAATCG TAGATCAGCA TGCTACGGTG AATACGTTCC CGGGTCTTGT ACACACCGCC 20880

CGTCACACCA CGAGAGTTTG TAACACCCGA AGCCGGTGGA GTAACCTTTT AGGAGCTAGC 20940

CGTCGAAGGT GGGACAAATG ATTGGGGTGA AGTCGTAACA AGGTAGCCGT ATCGGAAGGT 21000

GCGQCTGGAT CACCTCCTTT CTAAGGATAT ATTCGGAACA TCTTCTTCAG AAGATGCGGA 21060

ATAACGTGAC ATATTGTATT CAGTTTTGAA TGTTTATTTA ACATTCAAAT ATTTTTTGGT 21120

TAAAGTGATA TTGCTTATGA AAATAAAGCA GTATGCGAGC GCTTGACTAA AAAGAAATTG 21180

TACATTGAAA ACTAGATAAG TAAGTAAAAT ATAGATTTTA CCAAGCAAAA CCGAGTGAAT 21240

AAAGAGTTTT AAATAAGCTT GAATTCATAA GAAATAATCG CTAGTGTTCG AAAGAACACT 21300

CACAAGATTA ATAACGCGTT TAAATCTTTT TATAAAAGAA CGTAACTTCA TGTTAACGTT 21360

TGACTTATAA AAATGGTGGA AACATAGATT AAGTTATTAA GGGCGCACGG TGGATGCCTT 21420

GGCACTAGAA GCCGATGAAG GACGTTACTA ACGACGATAT GCTTTGGGGA GCTGTAAGTA 21480

AGCTTTGATC CAGAGATTTC CGAATGGGGA AACCCAGCAT GAGTTATGTC ATGTTATCGA 21540
GAGGAAGAGA AAGAAAATTC GATTCCCTTA GTAGCGGCGA GCGAAACGGG AAGAGCCCAA 21660

ACCAACAAGC TTGCTTGTTG GGGTTGTAGG ACACTCTATA CGGAGTTACA AAGGACGACA 21720

TTAGACGAAT CATCTGGAAA GATGAATCAA AGAAGGTAAT AATCCTGTAG TCGAAAATGT 21780

TGTCTCTCTT GAGTGGATCC TGAGTACGAC GGAGCACGTG AAATTCCGTC GGAATCTGGG 21840

AGGACCATCT CCTAAGGCTA AATACTCTCT AGTGACCGAT AGTGAACCAG TACCGTGAGG 21900

GAAAGGTGAA AAGCACCCCG GAAGGGGAGT GAAATAGAAC CTGAAACCGT GTGCTTACAA 21960

GTAGTCAGAG CCCGTTAATG GGTGATGGCG TGCCTTTTGT AGAATGAACC GGCGAGTTAC 22020

GATTTGATGC AAGGTTAAGC AGTAAATGTG GAGCCGTAGC GAAAGCGAGT CTGAATAGGG 22080

CGTTTAGTAT TTGGTCGTAG ACCCGAAACC AGGTGATGTA CCCTTGGTCA GGTTGAAGTT 22140

CAGGTAACAC TGAATGGAGG ACCGAACCGA CTTACGTTGA AAAGTGAGCG GATGAACTGA 22200

GGGTAGCGGA GAAATTCCAA TCGAACCTGG AGATAGCTGG TTCTCTCCGA AATAGCTTTA 22260

GGGCTAGCCT CAAGTGATGA TTATTGGAGG TAGAGCACTG TTTGGACGAG GGGCCCCTCT 22320

CGGGTTACCG AATTCAGACA AACTCCGAAT GCCAATTAAT TTAACTTGGG AGTCAGAACA 22380

TGGGTGATAA GGTCCGTGTT CGAAAGGGAA ACAGCCCAGA CCACCAGCTA AGGTCCCAAA 22440

ATATATGTTA AGTGGAAAAG GATGTGGCGT TGCCCAGACA ACTAGGATGT TGGCTTAGAA 22500

GCAGCCATCA TTTAAAGAGT GCGTAATAGC TCACTAGTCG AGTGACACTG CGCCGAAAAT 22560

GTACCGGGGC TAAACATATT ACCGAAGCTG TGGATTGTCC TTTGGaCAAT GGtAGGAGAG 22620

CGTTCTAAGG GCGTTGAAGC ATGATCGTAA GGACATGTGG AGCGCTTAGA AGTGAGAATG 22680

CCGGTGTGAG TAGCGAAAGA CGGGTGAGAA TCCCGTCCAC CGATTGACTA AGGTTTCCAG 22740

AGGAAGGCTC GTCCGCTCTG GGTTAGTCGG GTCCTAAGCT GAGGCCGACA GcGTAGGCGA 22800

TGGATAACAG GTTGATATTC CTGTACCACC TATAATCGTT TTAATCGATG GGGGGACGCA 22860

tAGGATAGGC GAAgcGTGcG ATTGGATTGC ACGTCTAAGC AGTAAGGCTG AGTATTAGGC 22920

AAATCCGGTA CTCGTTAAGG CTGAGCTGTG ATGGGGAGAA GACATTGTGT CTTCGAGTCG 22980

TTGATTTCAC ACTGCCGAGA AAAGCCTCTA GATAGAAAAT AGGTGCCCGT ACCGCAAACC 23040

GACACAGGTA GTCAAGATGA GAATTCTAAG GTGAGCGAGC GAACTCTCGT TAAGGAACTC 23100

GGCAAAATGA CCCCGTAACT TCGGGAGAAG GGGTGCTCTT TAGGGTTAAC GCCCAGAAGA 23160

GCCGCAGTGA ATAGGCCCAA GCGACTGTTT ATCAAAAACA CAGGTCTCTG CTAAACCGTA 23220

AGGTGATGTA TagGGcTGAC GCCTGCCCGG TGCTGGAAGG TTAAGAGGAG TGGTTAGcTT 23280

CTGCGAAgCT ACGAATCGAA GCCCCAGTAA ACGGCGGCCG TAACTATAAC GGTCCTAAGG 23340
TGTCTCAACG AGAGACTCGG TGAAATCATA GTACCTGTGA AGATGCAGGT TACCCGCGAC 23460

AGGACGGAAA GACCCCGTGG AGCTTTACTG TAGCCTGATA TTGAAATTCG GCACAGCTTG 23520

TACAGGATAG GTAGGAGCCT TTGAAACGTG AGCGCTAGCT TACGTGGAGG CGCTGGTGGG 23580

ATACTACCCT AGCTGTGTTG GCTTTCTAAC CCGCACCACT TATCGTGGTG GGAGACAGTG 23640

TCAGGCGGGC AGTTTGACTG GGGCGGTCGC CTCCTAAAAG GTAACGGAGG CGCTCAAAGG 23700

TTCCCTCAGA ATGGTTGGAA ATCATTCATA GAGTGTAAAG GCATAAGGGA GCTTGACTGC 23760

GAGACCTACA AGTCGAGCAG GGTCGAAAGA CGGACTTAGT GATCCGGTGG TTCCGCATGG 23820

AAGGGCCATC GCTCAACGGA TAAAAGCTAC CCCGGGGATA ACAGGCTTAT CTCCCCCAAG 23880

AGTTCACATC GACGGGGAGG TTTGGCACCT CGATGTCGGC TCATCGCATC CTGGGGCTGT 23940

AGTCGGTCCC AAGGGTTGGg CTGTTCGCCC ATTAAAGCGG TACGCGAGCT GGGTTCAGAA 24000

CGTCGTGAGA CAGTTCGGTC CCTATCCGTC GTGGGCGTAG GAAATTTGAG AGGAGCTGTC 24060

CTTAGTACGA GAGGACCGGG ATGGACATAC CTCTGGTGTA CCAGTTGTCG TGCCAACGGC 24120

ATAGCTGGGT AGCTATGTGT GGACGGGATA AGTGCTGAAA GCATCTAAGC ATGAAGCCCC 24180

CCTCAAGATG AGATTTCCCA ACTTCGGTTA TAAGATCCCT CAAAGATGAT GAGGTTAATA 24240

GGTTCGAGGT GGAAGCATGG TGACATGTGG AGCTGACGAA TACTAATCGA TCGAAGACTT 24300

AATCAAAATA AATGTTTTGC GAAGCAAAAT CACTTTTACT TACTATCTAG TTTTGAATGT 24360

ATAAATTACA TTCATATGTC TGGTGACTAT AGCAAGGAGG TCACACCTGT TCCCATGCCG 24420

AACACAGAAG TTAAGCTCCT TAGCGTCGAT GGTAGTcGAA CTTACGTTCC GCTAGAGTAG 24480

AACGTTGCCA GGCAAAAAAT GGATGCGATG AGCCGCATTG AGACCGCAAG GTCTCTTTTT 24540

TTTATGTCTA AAACGTCAAA ATAAAAAGCA AACACAAAGA AAAATGGCTT GGCGAAGTGA 24600

AAACDTTTGA ATCTGACGAA ACGAGAAAAG ArCGCAACGA GTTTAGTAGA GCTAAATGAG 24660

TAAGyGAGAG CCGAAGrAGA GGAAAGAAGC AAGCGATTGT CACAAGTCAA GAAAGGTTCT 24720

TAGCGAsGAT GGTAGCCAAC TTACGTTCCG CTAGAGTAGA ACTGGAAATG ATAATTTAAT 24780

AATGTACACT TTCGATTGTC TAAGTATGTA CAACTTTAAT TTTGTGTTTA TATAAATTTA 24840

AAATGATATC ATCGAAAACA AAATATTGTA TAAATAGAGA AGAGCAGTAA GACGGTATCT 24900

AATTGAAAAT GATCTTACTG CTCTTTTATA TACTTTATTG AAATACAAAA AGGAAATTAA 24960

TTATTATACA ATAGACAAGC TATTGCATAA GTAACACTAA CTTTTATCAA AGAAGTGTTA 25020

CTTTATAATT AATGATTTTA TTAGAGCGTC TACATGCGGT TTTAAAGCAT CATCGTCTAT 25080

ACCGCCAAAG CCTAATATAA ATTTAGGGGT TTTCTTATAG TCTTGATCAT CATCAAAATT 25140
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TCCATTTTTT ACTGTAATTG TAAAATGCAT ACCCGTTTCA GCACCTTGAA TATCAAGCTG 25260 

CTCTTTGTAA GGTTTCAATC TTTTTAAAAT ATAGGTTAGT TTTCTACGAT AAATTCGTCT 25320 

CATTTTATTT AAATGCCTTT CAAAACCACC GGAAGATATA AACGTTGCAA TAAGGTTTTG 25380

CATATGAACA GGTACAGTGT TGCCTTCAAT GTGATTTTGA GAATGATATT TTTTCATTAT 25440

AGAATAGGGT AACACCATAT ATGCAACTCG ACAGCTAGGA AAAATAGACT TTGAAAATGT 25500

ACTGATATAA ATCACTTTTT CTCCTCTTGA ATATAGACCT TGAATTGCTG GAATGGGTTT 25560

GCCGAAATAT CTAAACTCGG AATCATAATC ATCTTCTATA ATAAATCGTT CTTCTTTTTC 25620

TTGAGCCCAT TGTATTAATT GAGTTCGTTT TTTTAAGTCC ATCACATATC CAGTTGGAAA 25680

TTGATGGGAA GGCGTTATAT ATACTATATT TTTTTGTGAT TTAATAACTT CATCTACGTT 25740

TATTCCATTA TCTTCAACTT CAATTTGTTC ATATTCAACT TGTTTTTTAT CTAAAATATT 25800

TTTGATTGGT GGATAACTAG GTTTTTCGAT AATAAATGTT GAAGTATAAA GTAAATCGAC 25860

TAATTGATTT ACTAATTGTT CGGTAGATGA GCCAATTATA ATTTGATTAG GATCACAAAT 25920

TACGCCACGA TTAGTAAATA AATAAAATGC CAGTTGAAAC CGCAAATGTA ATTCTCCTTG 25980

AAAATGTCCT CTACGTAATT GATTTAAATG ATTTGTATCA TAAAGATCTT TGGAATACTT 26040

TCTGAAAAGT TCTATAGGGA AATGTTTCGT ATCTATTTCA TCCAAATTAA AAGCATAATC 26100

ATAAGCTTCA TCACTCGCTT TTGGTTTATA TGAATCATCA TCAAAAAGAG AGGGGATAGG 26160

TTGATTGTTT AAAATTGTTA AAGATTCAAT TTCGGACACA AAATATCCAG AGCGAGGTCT 26220

TGAATAAATG TAACCTTCGT CTAATAGAAG TTGATATGCA TGCTCTACGG TTGTTTGGCT 26280 

AATAGATAAA TGTTTGCTTA ATTGTCTTTT AGAATAAAAT TTATCGCCTT CTTTAAATTG 26340 

ACCTTCAATT ATTTGTTTTT TTAATTTTTC ATAAAGTTGA TGGTATAAAG TGTTTTTCAA 26400

TTTTATAACT GACCTCCTAA ATTTATCTTA TTTTGTACCT TTTTAAATAT CAGTTTATAC 26460

ATTACAATGT ATTTAATCAA CTTGAAAAGG GGTTTTATGT ATAATGAGTA AAATTATTGG 26520

ATCAGACAGA GTCAAAAGAG GTATGGCTGA AATGCAAAAA GGCGGCGTTA TTATGGATGT 26580

CGTTAATGCT GAGCAAGCAA GAATTGCAGA AGAAGCTGGC GCGGTAgCAG TTATGGCATT 26640

AGAACGAGTA CCTTCTGATA TTAGAGCTGC TGGTGGTGTT GCACGTATGG CAAACCCTAA 26700

AATTGTAGAA GAAGTAATGA ATGCTGTTTC TATTCCAGTC ATGGCTAAAG CACGTATTGG 26760 

TCATATCACT GAAGCAAGAG TATTAGAGGC GATGGGTGTT GACTATATTG ATGAATCAGA 26820 

AGTGTTAACA CCAGCAGATG AGGAATATCA CTTAAGAAAA GATCAATTTA CAGTACCATT 26880 

TGTATGTGGA TGTCGTAATT TAGGTGAAgm TGCGCGTAGA ATTGGTGAAG GTGCTGCTAT 26940 
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w 



ACAAGTTAAT TCAGAAGTTA GTCGATTGAC TGTAATGAAT GATGATGAGA TTATGACTTT 27060

TGCGAAAGAT ATCGGTGCGC CTTATGAAAT TTTAAAACAA ATTAAAGACA ATGGTCGTTT 27120

ACCGGTAGTT AACTTTGCAG CTGGTGGCGT TGCGACTCCT CAAGATGCTG CTTTAATGAT 27180

GGAATTAGGT GCTGACGGTG TATTCGTTGG ATCAGGTATT TTTAAATCAG AAGATCCAGA 27240

AAAATTTGCT AAAGCAATTG TTCAAGCAAC AACACATTAC CAAGACTATG AACTAATTGG 27300

AAGATTAGCA AGTGAACTTG GCACTGCTAT GAAAGGTTTA GATATCAATC AATTATCATT 27360

AGAAGAACGT ATGCAAGAGC GTGGTTGGTA AGATATGAAA ATAGGTGTAT TAGCATTACA 27420

AGGTGCAGTA CGTGAACATA TTAGACATAT TGAATTAAGT GGTCATGAAG GTATTGCAGT 27480

TAAAAAAGTT GAACAATTAG AAGAAATCGA GGGCTTAATA TTACCTGGTG GCGAGTCTAC 27540

AACGTTACGT CGATTAATGA ATTTATATGG ATTTAAAGAG GCTTTACAAA ATTCAACTTT 27600

ACCTATGTTT GGTACATGCG CAGGATTAAT AGTTCTAGCG CAAGATATAG TTGGTGAAGA 27660

AGGATACCTT AACAAGTTGA ATATTACTGT ACAACGAAAC TCATTCGGTA GACAAGTTGA 27720

CAGCTTTGAA ACAGAATTAG ATATTAAAGG TATCGCTACA GATATTGAAG GTGTCTTTAT 27780

AAGAGCCCCA CATATTGAAA AAGTAGGTCA AGGCGTAGAT ATCCTATGTA AGGTTAATGA 27840

GAAAATTGTA GCTGTTCAGC AAGGTAAATA TTTAGGCGTA TCATTCCATC CTGAATTAAC 27900

AGATGACTAT AGAGTAACTG ATTACTTTAT TAATCATATT GTAAAaAAAG CATAGCTTAA 27960

TGTATGCTAA ATCAACGAAT TATTGATATT TATAGATTTG TTGAGAAGAA AATATCTCCT 28020

TCAAACTTAG CTTTGGAGGA GTTATTTTTT ATGTCAAAAT TAAAAATGAT AAAAAATAAA 28080

GCTATACATA AGAAAAAAAC CCTTCAAAGA GACTGAGAAT AGTCAAAATT TTGAAGGGGT 28140
TAATTCGATG TTGATGTATT TGTTAAATAA AGAATCcAGC GATTGCAGCT GAAATGAAAG 28200

ATACTAGTGT tGCACCGAAT AATAATTTCA AACCAAAGCG GGCAACTGTA TCTCCTTTTT 28260

TGTCATTAAG TGATTTAATC GCACCTGAAA TAATACCGAT AGAGCTAAAG TTAGCAAATG 28320

ATACTAAGAA TACAGATGTA ACACCTTTTG CGTGTTCAGA TAAATCACTA AGTTTACCAA 28380

GTGCTTGCAT TGCTACAAAT TCGTTAGATA ATAGTTTTGT CGCCATAACT GAACCGGCTT 28440

GAACTGCATC TTGCCATGGC ACACCGACTA AGAATGCAAA TGGTGCAAAG ACAAAACCAA 28500

TTAATGTTTG GAAATCCCAA GAAATAGCGC CACCTGAAAC TGTACTAAAG ATATTGCTTA 28560

CAATTCCATT TAATAGAGCG ATAATGGCAA TGTATCCGAT TAACATTGCG CCTACAATGA 28620

CAGCTACTTT AAATCCATCT AAAATATATT CTCCTAGCAT TTCGAAGAAT GATTGTTGTC 28680

TTTCTTCAGT TTCTTCAACT AATAATTTGT CATCTTCTTC ATTAACTTTA TAAGGGTTAA 28740
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TAGGTTCAAT TAAGGTAAAG TATGCACCGA TAATTGAAGC AGAAACAGTC GACATTGCTG 28860 

AAGCTGTTAA TGTGTATAAA CGTTGCTTAG GTATGTATGG TAATTGTTTT TTAATTGAAA 28920 

TAAATACTTC AGATTGTCCC AAAATTGCTG CAGCAACTGC ATTGTATGAT TCTAAACGTC 28980 

CCATACCATT AATTTTAGAA ATTAAGAATC CTAAAACATT AATGATTAAA GGTAAAATCT 29040

TTGTGTATTG AAGGATACCG ATAATCGCTG AAATAAATAC GATAGGTAAT AATACACTGA 29100

AGAAGAATGG TGGTTGCTTA GGATCGATAT ATTGAATACC ACCGAATACA AAGTTAACAC 29160

CATCTGCTGC TTTTAATAAT AAGTAGTTAA AACCGTTTGA AATACCACCA ATAACCTTGA 29220

TTCCCATTGT AGTTTTAAGC AAGATAAATG CAAAGATAAG CTGAATTGCA AGTAAAATTC 29280

CTACATATTT CCAGCGAATA TTTTTCCTGT CTGAGCTAAA TAGAAACGCA AGTGCTAAAA 29340

AGAAGATAAT TCCGATAATC CCAATTAGAA TATGCATATA TTTCTCATTC CTTTAGTTTT 29400

TTCTACaATc TATCATACAA TAAAATGGAA GGGCTAACAT CATAAATTTT TGAAAATATA 29460

AAAACAAATT AATTGAAAAA GGTCAAAATA GGTCATATAA TATAGTCAAA GAAGGTCAAA 29520

AAGGGGTGAT ATACATGCAC AATATGTCTG ACATCATAGA ACAATAaTCA AACGTTTATT 29580

TGAAGAGTCG AATGAAGATG TCGTTGAAAT TCAGAGAGCG AATATCGCAC AGCGTTTTGA 29640

TTGCGTACCA TCACAATTAA ATTATGTAAT CAAAACACGA TTCACTAATG AACATGGTTA 29700

TGAAATCGAA AGTAAACGTG GTGGTGGTGG TTACATCCGA ATCACTAAAA TTGAAAATAA 29760

AGATGCAACA GGTTATATTA ATCATTTGCT TCAGCTGATT GGACCTTCTA TTTCTCAACA 29820

ACAAGCTTAT TATATTATTG ATGGGCTTTT AGATAAAATG TTAATAAATG AACGTGAAGC 29880

TAAAATGATT CAAGCAGTTA TTGATAGAGA AACGCTATCA ATGGATATGG TTTCTAGAGA 29940 

TATTATTAGA GCAAATATTT TAAAACGTTT GTTACCAGTT ATAAATTATT ACTAAATGAA 30000 

ATGAGGTGTT GAAGTGCTTT GTGAAAATTG TCAACTTAAT GAAGCGGAAT TAAAAGTTAA 30060 

AGTTACAAGT AAAAATAAAA CAGAAGAAAA AATGGTGTGT CAAACTTGTG CTGAGGGGCA 30120 

CCATCCGTGG AATCAAGCTA ATGAACAACC TGAaTATCAA GAACATCAAG ATAATTTCGA 30180 

AGAAGCATTT GTTGTTAAGC AAATTTTACA ACATTTAGCT ACGAAACATG GAATTAATTT 30240 

TCAAGA 30246

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14333 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

TATTCCCCCA TCGGTTTATT AAATCGTCCA TTTCAATACT GTTTTTCCCC AAGATGTCGA 60

TAAATCCATT TCAAACGCTT GGACGATATC TTGCATCGTA CATACATTAA TTTCATGTCC 120

TTTTAATAAT GCTAACTTTT CAACTATGTC TGGGTACTTA CGATATAAAT CAACAACTTG 180

CTCAAAATCT TTAGAGCCGC TTCGACTACT ACCAATCAAC GTTAATCCTT TTTCAAGTAC 240

TAATCGTGTA TTCACTTCCA CGGGTAATTC ACTTACGCCT AACAAAGCAA TACTGCCTTC 300

TGGTGAAATA TGTTCAACTA TTTGTTGAAG TGCAACTTGA CTTCCTTTAC CTCCAACACA 360

TTCAAATGCA TGATCAATTT TAAGATCATC TGGTATTTGA TTTACTGTAA AGATGTCATC 420

TACAAATGAA AAATGACTTA ATTTATAGTC TGTCTTACCA AATACATAAG TTTTAGCTTC 480

TGGGTACAAC TTACGTAGCA AAATAGCAGT AATATAACCT AAGTTACCAT CACCCCAAAT 540

ACCAAAGCTG GTTTTCAAAG GTATAGATTT ACGTTCAAAT CGTTGTATAG CATGATAACT 600

TACTGACACT AACTCTGTGT ATGAAATCGT ACTCAAATCA ATGTCATTAG GCAGCGGAAC 660

GATACGATCA TGTGCCATCA CAACGTAGTC TTGCATAAAA CCATCATAAC CACTAGATCT 720

AAAATAACTA GAGGCTAAGT AATTCTCCGC AATAATATGA TGTTGCTCTG TAGGTGTATT 780

CGGTACCATT ACTACTTTCG TACCTTTTTC AAATACCCCT TTACTATCAA ATACAACTTC 840

ACCAACAGCT TCATGAACTA ATGACATTGG TAATTTTTTG CGTAGTACAT TTTCATCTCT 900

TCGACCTGTG TAATACCTTT GATCAGCTGC ACAAATAGAC AAGTATAAAG GTCTTACGAT 960

GACATGATTA CCATAAATAT CAACATTATT ATATGTGACG TCGAACTGTC TCGGTGCAAC 1020

GAGTTGATAT ACTTGATTAA TCATCGGCAA TATCACCTTG AATAATGGCA TTTGCTACTT 1080

TTAAATCATA CGGTGTTGTC ACTTTAATGT TGTATAGTTC TCCaCGTACC AATTTAACTG 1140

CAT<STCCAGA TTCGACAATG ATTTTACATG CATCTGATAA GATTTCTTTT TGTTCACTAC 1200 

TTAAGGCGCG ATAACTATCT TGTAATAATT TAATATTAAA TGATTGTGGT GTTTGGCCTT 1260

GATACATTTC ATTCCTTACA GGGATACTGT GTATGTTCTG TTTATCTTTA GACATTACAA 1320

TCGTATCAAT TGCTTCAATG ACTGTATCTA CTGCACCATA TTTTGCTGCT ACTTCAATGT 1380

TCTCTTTAAT AATACGTTGA GTTAAAAATG GTCTTACGGC ATCATGAGTT ACAATCACAT 1440

CATCATTATT AATTCCATTT ACATTGCGAA TATGGTCGAT AATGTTCATA ATTGTTTCAT 1500

TTCGATCCGT ACCACCTGCA ACTACTTTGA CACGTTGATC TGTAATGTTA TATTTTTTTA 1560

AAATATCCTG TGTATGGGAA ATCCACTGTG CTGGCGTTGC GATAATAATC TCATTAAATT 1620

CACTCACTAA AATGAACTTC TCAATTGTAT GGATTAAAAT CGGTTTATTA TCAATATCTA 1680
CTGCATAAAT CATGTTGTCC TCCATTCTGT CATTACATCA TTTCCATTTA TACATTACTG
ACCTATGCCC GCACATAAGC CTAACCTATT GCTCACTTGC CTCTTTTATT AATCCAAAGA
TAGTTGTCAC AATAGTGTGA TAATTTTTTA TAAAAATGTA TTTTTGTAAC TGACCATTCT
AAGTTGTTTT GCCATGCAGT TAATCATTAA CTCTGACGAT ATTAAATTGT TAAAGGTATT
AATGTTTACT CTTTTTCAAA TTCATTATTA CTGCCATCAT TTTACCATAT ATTATAATAA
ATTTATCTTA TTAAGTGGCT GTACTTGATT TTCACTTTAA AAATTATCAA ATATTGCCAT

CTCATTTTAA GTATACAAAA TGCAAAACAA CCGATTCACA AGCATATTTC ACACAAGTAA 2160

ACCGGCTATT TATCAACGTA TATTCGAAGA TGAATTATTT CGATAGTATC TATAGACCAG 2220

ACGGCATTCG CACTTTCATA GCTATAACTA TACCAGCGTT TTCGTCCTCA AAGGTGCATA 2280

CTAATAAATC GTAAACATGA CTTTATCAAA TCGTTCTTTC TTGTTAACTA ATTTATCAAA 2340

TGTCTCCGGG CCTTTTTCTA ACGGTAAAAA ATGAGAAATA ATAGGCTTTA CATTAATATC 2400

TTTCGTCTTC ATATAATGTA AGGTTGCCGT CCACTCTTTG CCCGGAAAAT TACTGGACAA 2460
ACAGTTCCAA GAGCCACATA CTGTCAACTC GTTACGCAGA ATTTTTTCAA AATGAACGCG 
ATCAATCTCA ATATCATCAT ATGGTATTCC GAGTAATACC ACCTCGCCAC CTTTTTTAGG 
TAGCGTCAAT ATTTGACCAA TCGTAACTTT AGCACCTGAT GATTCTATAG CTAAATCGAT 

TTGATTGGCG TAATGATTTT CGATGAATTT CTCAAGATTT TCTTCTTTTG AATTGATTGT 2700 

TTGATGTGCG CCCAATGATG TTGCAATATC TAGTTTATGC GCATCTATAT CTATAGCGAT 2760 

GATATGTGCA GCACCAAATA TTCGTGCCCA TTGAATAGCT AACAAACCTA TACTGCCACA 2820 

CCCCATTACT GCAACAGTCA TACCAGGTTG TATATTCGAT TTATAAAACC CATGCGCAAC 2880

AACGGCTGAT GGCTCAACCA TTGCTGCTTC AATGTAATCA ACATTGTCTG GAACCTTTAA 2940 

AACATTTTGC GCTGGCAATT TGACATATTC CGCGAACGAT CCAGGTTCAT ATGAGCCAAT 3000 

GACGAATAAC TTTTCACATC GTGCATATTC ACCTTTTAAA CAATACTCGC ATTGATAACA 3060

AGGTATTGCt GGGCAACCTG TCACTTTGTC GCCCACATTA ACATGCGTAA CATCACTTCC 3120

AATGGCATCT ACTACACCTG AAAATTCATG ACCAAATGGC ATACCTTTAA TGTATGGCCC 3180

CATTTTTTTG TATCGTGACG TGTCTGAACC ACATATGCCA GTCGCTCGTA CTTTAATAAT 3240

AACGTCATTC GCACTTTCAA TGACTGGCTT TTCATTATCC TCATACCGTA AATCTTCCAC 3300 

GCCATATAAT TTCAATGCTT TCACTTGTAA ATCACCTCAA ATTTGATTTA ATTCACAACT 3360 

TTTTTCTTTT TAAAAATACC TGTCGCAAAA TAACCTGCAA TGACAATGGA ATTACTTACG 3420 

AGTAAATGTT CCATATAAAA ATCAGTGATT TGTCTTAATG GCCCAAGGAT AAAAGTTAGC 3480
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TGCTTTAATA CCTTCGCCGG ATTTTAAATG TTGATACGCC TCGTCCCATT TCGAAATATC 3600 

ATATATTTTT GTCACCAAAG CTTCAGCATT TACTAAACCA TCCGCCATAA GTTGCAATGA 3660

AGGTTCCCAA TCTGCTGGCT TTTGACTTCT ACTACCAACA ACTGTTATTT CTTTTTGAAT 3720

CACTTTTTCC ATATCAAATG GAATTTCAGC ATCCTTAAAA ATACCTATTT GACTGTAGAA 3780

ACCTTTTTTG CGTAAAATAT CCAAACCTTG TCGTGCTGCT GGAACTGCAC CTGAACATTC 3840

AACAACAACA TCTGCACCGT AACCGTCTGT AATTCCATTG ATATACGTTT TTAAGTCTGT 3900

TTGTTGTAAA TTGACTACAT AATCCATGTG CAATGCTTCT GCTTTATCTA ATCTGACTTT 3960

GTCATTGTCC AATCCAGTTA CCACAACAGT TGCGCCTTTA CTTTTTAACA CTTGTGCTAC 4020

AAGTAATCCG ATTGGCCCAG GTCCCATTAC AACTGCTACA TCGCCTGAAT TGACTTGAAT 4080

CTTAGAAACG CCATGATGTG CACATGCTAA TGGTTCTGTC ATAGCTGCAG ACTGATACGA 4140

TAtTCGTCTG GAATATGATG CAAACTTTCT TCACGTGCAA TGACATAATT AGTAAATGCG 4200

CCATCAACTT GTGTTCCAAT ACCTTTTCGA TGGTTGCATA AATTATAGTC TTTTGATTTA 4260

CAGTATTCAC ACTCATTACA AACATAGAAT GTCGTTTCAG aTGtGACACG GTCACCAACT 4320

TTAAAATCTT TAACGTCTGC TCCAACTTCA ACGATTTCAC CAGAAAATTC ATGACCTAAT 4380

GTCACTGGAA AATTAACTTT ATAATGACCT TCATAAGTAT GAATATCTGT GCCACAAATT 4440

CCTGCATAAT GTACTTTAAT CTTTACTTTA TCATCTAGCG GTGTTGCAAC TTCTTTATCA 4500

AGAAGTTCTA AGTTGCCATG TCCTTCTCTT GTTTTTACTA AAGCTTTCAC CACAAACACC 4560

TCGATTTTTA ATTGAATAGA CTAAATAGTT TAAAGATAAG ATAGTTAACG ATATTACCAC 4620

CTTGATCAAT ACTTGAAATT TCAGATGAAC CTTTTGGCAT TTGTACATTC GTACCTTTCG 4680

CCATATCTGT GAAAATGGGT GCTACGTCTG TTGCAATATA TAGTGAAATT GCAATCATAA 4740

TCGTACCCAC AATGACAGAA TGAATAATGT TTCCTCTTGC TGCACCAACA ATAAACGCGA 4800

CAACAAATGG TATCGTTGCT AAGTCACCAA AAGGTAGTAC TTGGTTTCCT GGTAAAATAA 4860

CGGCTAATAA AACAGTGATA GGTACTAAAA TTAATGCTGT CGAAATAACT GCTGGATGAC 4920

CTAATGCTAC AGCCGCATCC AATCCAATAT AAATTTCACG TTCGCCAAAA CGTTTATTTA 4980

GCCATGTTCT TGCAGACTCT GAAACTGGCA TTAAACCTTC CATTAAGATT TTTACCATTC 5040

TAGGCATTAA TACCATTACT GCAGCCATTG ACATTCCTAA ATTAATGATG TCTCCAGGTT 5100

TGTAACCTGC TAACACACCA ATACCTAAAC CTAAAATTAA GCCGACAAAT ATAGACTCTC 5160

CAAATGCGCC AAAACGTTTT TGAATTGTTT CAGGATCAGC ATCTAACTTA TTCAGACCGG 5220

GTACTTTTTG TAACAATTTA ACTAAGTAAA TACCTGGTGC ATAAGAAATT GTACTTCCTG 5280
CTACTTTCAA ACAGATAATT TGGAAAATAA CTGCTGCTAA TAACGCTTGC CAAATACTGC 5400

CTGATACGGC ATAAACCATT GCTGCTGTAA ACGTATAATG CCAAAAATTC CAAATATCTA 5460

CATTCATCGT CTTTGTCACT TTAGTTACTA GCAATACAAC GTTAACTATG ATTCCGAGTG 5520

GAATAATAAA TGCTGCGACA GATGATGCCC AAGCGATAGA TGATGTTGCT GGCCAACCTA 5580

CATCAATCAC ATTCAGACTG ACGCCTAAAT TTTTAACCAT CGCTTGTGCT GCTGGCCCTA 5640

AATTTTTAAC TAATAAATCG ATGACTAAGA AAATCCCTAC AAAAGCCACA CCTATTGTTA 5700

AACCAGACCT AAATGCCGCT CCAATTTTCT GCCTAAAGAA TAGGCCAAGC AAGAATATGA 5760

CAACCGGTAA AATAACAGTt GCACCTAAAT CTAAAAATCC CCTTACAAAA TCAGTGAAGT 5820

AACTCATATT TAAACCCTCC CTGTTATATA TGCATTGTCA CGATACTTTC CGATTGTGAT 5880

TACATTTGAC GTTACAGTCA TTTCAACGAC AACCCTTGCT AAATTCGACT GCAGTCCTTT 5940

TGAATTACAG tCACTGCGTT TCTATGTCAT CAACAATCAT TTGTCGTGAT AGTCATTTAT 6000

ATGCAATTTG CATATATTAA TATGTTATCG ACCCACGTTA CATATCAATT CCGTTATTTT 6060

TGTAACTCTG TTAAGATTTG TTGTTTTGTT TCTTCAATAC CAATACCAGT TAAGAAATTA 6120

CGTGCGTTGA TAACTGGGAA TTTATATTCT ttttttgtca TTGCAGTTGT AACTAATAAA 6180

TCTGCAGTGT CTTCATAAGG TCCAACTTCT GTAATTTTGA TTTGTTTAAT ATCTACTTTA 6240

ATATTGTGTT CCTTTGCCAT TTCTTCAATT GCATTATTTA CTACTGTTGA CGTTGCAATA 6300

CCTGCACCAC ACGCTACTAA TACTTGTTTC ATTTTCAATT CCTCCAATTA ATTTTTAGTT 6360

ATATTCCAAA TAATCATTGA TTAGTGTTGC TAAAATTGTT TCATCTTTCG TTCGTAGAAT 6420

CTGCTCCAAT TTTTCTTCAC TTTGAAAAAT TTGCATCAAC TGTTGTAACA GCTTAAGTTG 6480

ATCATCTACT TTATCCATTG CTAACATAAA AACGATTTTC ACTTCTGTCT GTTGATCAAG 6540

TGTTCCCATT TCAATAAACG GCACTTCTTT TTCTAGAACA GCCACACCTA TCGTTCTATG 6600

gttXatatgt TCGACATCTG TATGCGGTAT AGCGACCGAA CATAGATGCG TTGGTAAACC 6660

AGTAGCAAAT TCTTTTTCTC TGTCGATGAC XGCATCTTTA AACGTTGACT TCACGAACCC 6720

ATTTTGAAAT AACACATCTG ACATTTGTGA CAATACGGAT TCTTTATCAG TTGCCGACAA 6780

ATTGAGCATT ATATTTTCTT TATGCACTAA TTGCTGTCCC ATCCATTTTC CCTCGCTTCT 6840

TTATTTGAAT AATTTTTTAA AATCTCATTT ACATCAGAAT TTTTGCGACT TTGTATGATG 6900

CGCTTAATTG CGTCATTGTC TTGCGCCACA TCTCTCAATT GTAGTAACGC TCTTAAGTGT 6960

GTCACTTTAT CAACAGCAGC AATAGGTACA ATAATATGGA TTGCTGTGCC ATCTGACATG 7020

TATATTGGTT CTTGTAATAT CAACATACTC ATCGCTGTTT TATGTACATG CTTTTCAGAG 7080
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10 



15 
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'35' 



TGCATCTCAT GAATATATTT AATATCAATA AAATGATTAG CAACTAACAC ATCACTTGCT 7200

TTAGCAATAG CTTCATCAAT ATTTTCAACA TGATGCATTC TTTTCACGTG CCTTGCCGGT 7260

ATCAAGTCAG CTAAATCTAA TGyCTwATTT tGTGtGACaA TCGATCCATT AATGGTTGAA 7320

ATTGAATTAT AATTGGCAAT AAAATCTTCT AAACCATCAC GTAGTcTGTA ATGTCATTAA 7380

CTGTCGTTGT GCGTTCAATT AATGCCATTA ACTTGTTTAT TTCCTTATCA ATGTCAGCCG 7440

ATTCCTTATT AATGTACTTC ATCACTTCTT TACGTAACTT TCGTTGCTCA TTTTCAGATA 7500

AAGCTACTTT TGTGATAAAT AATTTTTTAT GTGTTAGGAC AAACATTGGT GAAAAGACGA 7560

TGTCATAATC TAATGTGTAA TTTTCAAATG TTCTAAGTGA AATCGCATCT AAGAAAATAA 7620

TTTCTGGAAA TAAGTTTCGC AACTCGTATA ACATCATTTG TGATACTGAC GTGCCTTGTG 7680

TACACACGAT AATAGCTTTT ATCTTGCCAT CGAAGTTTTC ATCTTGACGT CTCAAACTAC 7740

CTCCGAACAA CATGGTTAAA TATGCTATTT CATTATCAGG CAACGATTTT CCGAAATATT 7800

CAGTTAACGA TTGACATGAT TGTTTCACCA TATGAAATAA GGATTGATAA TTTCCTTGTA 7860

AAGGATTTAT TAATTCATCA CGATCCGTTA AGTTATATTT AATCCTATAA AAAGCAGGCG 7920

TTAAATGTAA CAAGAGTTGC TGTGATAATT TCTCCTTATC TTCAATGTTA ATAAAAGTGA 7980

TTTGTTCAAA ATGGTGAATC ATTTGAGCGA TGGCCATCGT TAAATTCGAT ATGCTATCTG 8040

ATTCTTGCAA ATCAGTCCAT TGCACACTTG TTGAAAGTAA GTGTAATGTC AAATATAACT 8100

TTTCCGCTTC TGGCAAATCC GGCTCATGTT GCGTCATAAT CTCCGTTGCT TGATATTCTT 8160

TCGTATCCCT CAAATACTGA TAATTAATAT TTAATGGATT CATCACATGA CCACTTTGAA 8220

TTCGTCTACG AATCACACAA AGGACATAAG GCAATGAACT AAGTGATTTG TCTATAAAGC 8280

GACTCTTCAA AAATTGTTCT ACCTGTTTGA TCTTGTCTTT TTGATATGCG ATATCTTCGA 8340

ATG&AAGTT GAGCGCCTTT AAAACTTCAC TTTTAGTAAT ATCATGATTC AACCTTTGAT 8400

CAATCAACTT AATGAAGAAA CGGCGAACTT CAAATTCATC ACCAACAATT TCATAACCAT 8460

GTTTTCGAGA ATACTTAAGT GACAAACCAT GATTTTCCAA TTGCTCTTTC ACATGATTTA 8520

TATCGTGAAT GACAGTATTT TTACTGACTT GTAAATCAAT TGAAAAATGG TTTAGAGACA 8580

TTGCGTTTTC CTTACTAAAA AGCATGAGCA TTAAATAATA ACGACGTGTT TCTATGCTAA 8640

AAATGACATT GTTGCCGTTT AACATTTGCT GCTCCGATAC ATCTCGCTTG AATAACGTCA 8700

TGATTTCAGA ACTTACAATA AAATTTCCTT GGCTTGTTCT TTCAAGTTTT GGATAACCCT 8760

CTTGTTCAAG CCACAAATTG attttttgaa TGCGATATCC TAGTTGTCTA CGAGACAAAC 8820

CAAATATCGA TTCAAGTTCT TTACCATGAA TAGTAGGATT CAATACAATT TCTCTGAGTA 8880
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50 



TCAATCGTCA CACCGATGTA CACACTTTGA ACACATATTT TCAAAATGAG CATGTACATC 9000

ATTGTGATGT TTTAACAACA TTTCAATTAT ATCTATATTT TTTGTGATTT TAATCTTTTA 9060

AAATAAAGCA ATTGAAATTT TTGCATATAT TTTTGTGTTT TGTGTTTTTT TGAAGCATTT 9120

TTAACATACA TATCTCAATC ATTATCAAAT TGTCATGACC ATTGTAACCC AATACAAAAA 9180

CCCTAAGGAC GCTTATATGA GGCGCCTTAG GGTTAACTGT ATCTATTTAA TTAAGTATTA 9240

TTATTCGTAT GTACGTAACT TATGGTCTAT CAAGTTCCAC ACTTCTTCAA CATCAACTGC 9300

TGTAGCAAAA TAAGCATTGG CAGGCTTACC TGTAACATGA TTTAAATCGA CAGCCATAGT 9360

GCCATAAGTT AGTGGACTTT GATGTTCAAT GTCGATATTA ACGGGTACCA TTGTAAACAA 9420

TTCTGGTTGT AACAAATACA AAATTGTACA AGCATCATGT ATTGGACCAC CATCCATATT 9480

AAAGTGAGTC TTGTATGTCT TCTTAAAGAA TTGCAATAAT TCTACGACGA ACTGTGCAAC 9540

AGGATTATTG ATACTTTCAA AGCGTTCAAT CACGTGATCG TCGGCTAAAA CTTGATGTGT 9600

TACATCTAAA CCAAACACAT TTATAGTAAT CCCACTTTCA AAAACACGCT TCGCTGCTTC 9660

AGCATCTACC CAAATATTGA ATTCTGCTGT AGGCGTCCAA TTTCCAAATG TACCACCACC 9720

CATCAAAGTA ATAGATTCAA TATGCTCAGC GATTCTTGGC TCACGAATCA ATGCCGTTGC 9780

TACATTCGTA AGAGGACCTG TCGCTACAAT TGTTACAGGT GTATCACTCG TCATCACTTT 9840

GTTTATAATC ACATCTGATG CTGGCATTGC AACTGCTTGA CGTGATGGTG TCGACGGTAG 9900

TTTCGGACCA TCTAATCCAG ATTCCCCATG TATTTCAGAA GCAAAGGCAG CTGGTTTAAT 9960

TAACGGCCTA TCCGCACCTT TCGCTACTGC TATATCTTGG CGTCCCATAA TATCCAATAC 10020

GTTCAAGGCG TTTGTCGTAT TCTTGTCAAC TGATTGATTA CCTGCGACTG TTGTTACAGC 10080

TAATATCTCT AGTGGACTGT CAATTGCCCC CGCTAAAATT AATGCTATTG CATCATCGTG 10140

TCCTDGATCA CAATCCATAA TAATCTTTCT TTTCATTTAT ATATCCACCT TTCTTAAGTT 10200

GTTATCGATA GCTTATGTAT ATTTATTTAT GTGGTGAATC ATGTTTATTT TGAAAAATAG 10260

TTTTAACTTT CTCATATTTT TGGATACAAA CACTATTTAT CTATTTTATG GCTTATAAAT 10320

TTATCCGATA TGCCTTATCA ACCTACCTCG CTAAAAATAG GATGTCTACA TATCTATACC 10380

GACTTTTGTC AACTCATTTT CACAACAATA TAAACAGCAA TTTATATGAT TGTTACATGA 10440

TTCAAACAAT TTTTATGAAA AATATTTTCA TACACAGAAT ATATATTGAT ATTAAATTTC 10500

TCAAAAGCTA TATTGAGAAT AATTAGGAGG GATGTTGATG AAATCTTTAT TTGAAAAAGC 10560

ACAGCAGTTC GGCAAGTCCT TTATGTTACC TATCGCAATC TTACCAGCTG CAGGTCTATT 10620

GTTGGGTATC GGTGGTGCAT TAAGTAATCC AAACACCGTT AAAGCATACC CTATTTTAGA 10680
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AAATTTACCG 


GTCATCTTTG 


CAATTGG TGT 


CGCAATCGGA 


TTATCTAGAA 


GCGATAAAGG 


10800 




TACTGCAGGT 


tTAGctGCGC 


TGCTCGGTTT 


CTTAATTATG 


AACGCAACTA 


TGAATGGCTT 


10860 


5 


ATTAACTATC 


ACGGGCACAT 


TGGCAAAAGA 


TCAGCTTGCA 


CAAAATGGAC 


AAGGCATGGT 


10920 




GCTCGGTATA 


CAAACGGTTG 


AAACCGGTGT 


TTTTGGCGGG 


ATT AT CACAG 


GTATTATGAC 


10980 


10 


CGCAATACTT 


CACAACAAAT 


ATCACAAAGT 


GGTATTACCA 


CCGTATTTAG 


GTTTCTTTGG 


11040 


TGGCTCTAGA 


TTTGTC CCTA 


TTGTCACAGC 


ATTTGCCGCA 


ATCTTTTTAG 


GTGTATTGAT 


11100 




GTTTTTCATT 


TGGCCAAGCA 


TACAAGCCGG 


CATTTATCAT 


GTTGGTGGAT 


TTGTAACGAA 


11160 


15 


AACAGGTGCC 


ATCGGTACTT 


TTGTTTATGG 


CTTCATCTTA 


AGATTGTTAG 


GTCCACTCGG 


11220 




TTTACACCAT 


ATTTTTTACT 


TACCGTTTTG 


GCAGACGGCA 


CTTGGTGGTA 


CTTTAGAAGT 


11280 




CAAAGGGCAC 


TTAGTTCAAG 


GTACGCAGAA 


CATCTTCTTT 


GCTCAACTTG 


GTGATCCAGA 


11340 


20 


TGTGACGAAG 


T ATT ATT CAG 


GTGTGTCACG 


CTTTATGTCA 


GGCCGTTTTA 


TTACGATGAT 


11400 




GTTCGGCTTA 


TGTGGTGCCG 


CACTTGCAAT 


TTATCACACA 


GCTAAACCTG 


AACATAAAAA 


11460 




AGTTGTCGGC 


GGTTTAATGT 


TATCCGCTGC 


ACTCACTT CA 


TTTTTAACAG 


GTATTACCGA 


11520 


25 


ACCTTTAGAG 


TTTAGTTTCT 


TGTTTGTCGC 


ACCTATTCTT 


TATGTAATCC 


ATGCCTTCTT 


11580 




TGATGGATTA 


GCATTTATGA 


TGGCAGACAT 


TTTCAACATT 


ACAATTGGTC 


AAACCTTCAG 


11640 




x vjvjnuvL XXX 


ri x ^>VjA1 1 1L1 


TACTCTTTGG 


TGTGCTACAA 


GGTAATAGTA 


AAACAAACTA 


11700 


30 


CCTATACGTC 


ATACCTATTG 


GAATTGTGTG 


GTTCTGTTTG 


TATTACATCG 


TTTTCAGATT 


11760 




CTTAATTACG 


AAATTTAATT 


TCAAAACACC 


TGGTCGAGAA 


GATAAAGCTG 


CAGCACAACA 


11820 


35 


AGTTGAGGCT 


ACTGAAAGAG 


CACAAACTAT 


TGTTGCTGGT 


TTGGGAGGCA 


AAGATAACAT 


11880 




TGAAATCGTT 


GACTGTTGTG 


CAACGAGACT 


ACGCGTCACA 


CTTCATCAAA 


ATGACAAAGT 


11940 




CGATAAAGTA 


TTACTCGAAA 


GTACTGGTGC 


CAAAGGTGTA 


ATCCAGCAAG 


GCACTGGTGT 


12000 


40 


GCAAGTAATT 


TATGGGCCTC 


ACGTTACAGT 


TATCAAAAAT 


GAAATTGAAG 


AATTGCTCGG 


12060 




GGATTAAGAC 


TAACCGAAAT 


ATCAACAGAA 


CTAATGGCAA 


CGATGTACGA 


AGTAAGAAGT 


12120 




GACATCGTTG 


CTTTTATTTT 


TAATGTTACA 


TTTGAAGCAT 


TAAGTTCATC 


ATGCACTGTA 


12180 


45 


GTGAGCCCGC 


AAATCGCCTC 


TGCTAGACAA 


TCATCTTAAT 


GCTATGATTA 


AAGCTTAAGT 


12240 




GCCAGATTTG 


AATTTAATTT 


CAACAACGAC 


TTTCACTACA 


TTAAAAATAG 


GGCCACTCGA 


12300 




CACATATAGT 


TGTATCAAAT 


AGCCCTTTAT 


ACAATTTTTT 


GGGTAAGGTT 


TTACAATTTT 


12360 




TGGGATGGTA 


TAGATTTTAT 


AAAAAGTTAT 


TTAAGTTCTT 


CTGCTTCAGC 


CATAATATCT 


12420




TTTAATGTTT 


TAGCTGAATG 


TGCGAACTTG 


CTTTGTTCTT 


CGTCGTTTAA 


TGGGATTTCT 


12480 
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TCCTCATATT CGCCTTCTAA TAATGCTGAT ACAGTCAATA CGGCATCTTC ATTTCTGAAA 12600 

ATCGCTTCAG TAATTCTAGC TAATCCCATT GCAACACCAT AATAAGTGGC ACCTTTAGCT 12660 

TGAATAATGT CATATGCTGC ATCACGTGTT TGAACAAAAA TTTGTTCAAT TTGCGCTTTG 12720 

CCCTCAGGAC GTTGTTCAAG TAATGTCTTC AAAGGTTGAC CCGCAATATT AGCGTGTGAC 12780

CATACTGGTA ATTCAGTGTC ACCATGTTCA CCAATAATTT GAGCATCGAC GCTACGTGGC 12840

GCAACATCGn AcgyTcGCTT AACAATAATC TAAAGCGTGC AGAGTCTAAA ATTGTACCAG 12900

AACCTATAAC ACGTTCTTTA GGTAAACCAG AGAATTTCCA TGTTGCATAC GCTAAAATAT 12960

CAACAGGATT TGTAGCTACC AAGAAAATAC CATCAAATTT TGATGCCATT ACTTCACCAA 13020

CAATTGATTT GAATATTTTC AAGTTTTTAG ATACTAAATC TAAACGTGTT TCTCCAGGTT 13080

TTTGTGCAGC ACCAGCACAG ATGACAACTA GATCCGCATC ATGACAATCA CTGTATTCGC 13140

CAGCTTTCAC ACGAACTGTT GTTGGAGAAT ATGGTGTGGC ATGTTTTAAA TCCATAACAT 13200

CTCCTCGAAC TTTTTCAGTG TCTAAATCAA TGATGACTAA TTCATCAACA ATGCTTTGGT 13260

TCACTAATGA AAATGCGTAG CTTGAACCTA CTGCACCATT ACCTATTAAT ACAACTTTGT 13320

TCCCTTTAAA TTTGTTCATT ACAAAAACTC CCTTATGATT AATTCACTAA CATACATGTA 13380

GCTTCAAATA TGTTAGTTTA ATGCTGCTTA TTGACGATAC AAAAGCAAAT AAACATCTCT 13440

TTTATTTTCA ACGCATAACT TAAAAGGTCA TGTGTCATCC GCTTTTAAGT TTGTGATTTA 13500

TTTCACATAT AAAATGTAAC ATGCATTAAG TACTGGGTCA ATATTAAATT GTGATTTATT 13560

TCACATTTTA TTTTAATTTT TACACCTTTT TAATTTGTAT mCGATTACAT CTTAGATGTC 13620

TTTAGTCTTC GTACTTCGCC AGTGATTATT TACACTTTCA CATTTTTATT ATCATGTTTA 13680

CTTTTTTCTA GGAAAACAAC AATGTTTTTT GAATTAGTCA AATAAATGCG CTCAATCGTC 13740

GGTGTGCAAA CAGACAATTG TACACAATGC TTATTGATAA GTATTTAAAA AATTAAAAAT 13800

GTCATACAAT TATCAAATTT GCCATTTTAT TTATATTTTC TCAAACCAAT TAATTGAATA 13860

TCGAAATTXT TAGTAGAATA ATCAAAATAT ACAGATTAAA GGAGGAGTAT CATGCTTACA 13920

GAACAAGAGA AAGACATTAT CAAACAAACG GTGCCTTTAC TTAAAGAGAA AGGGACAGAA 13980

ATTACGTCAA TCTTTTATCC AAAAATGTTT AAAGCGCATC CTGAACTTTT AAACATGTTT 14040

AATCAAACGA ACCAAAAACG AGGCATGCAA TCTTCAGCAT TAGCACAAGC TGTAATGGCC 14100

GCAGCGGTTA ATATCGATAA CTTAAGTGTT ATTAAACCAG TCATTATGCC AGTCGCATAT 14160

AAACACTGCG CACTACAAGT TTATGCTGAA CATTATCCAA TTGTGGGGAA AAATTTATTA 14220

AAAGCCATTC AAGACGTGAC AGGATTAGAA GAAAATGACC CTGTCATTCA AGCTTGGGCA 14280
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(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8779 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:



GGTATTTTnG 


GAnGGGTACC 


TAAAGCAATT 


CCGGCAAAGG 


GTnAATCCAG 


GTACCGAAAT 


GGACTTCCCG 


TTATCGATAA 


TACCGACATA 


TATTGTGACA 


AGTAGATTTT 


ATGGACATTT 


AGGCTTACTT 


TTACTTGTGA 


TAATTGCATG 


TATGTTTACT 


GGTATTTA t C 


CaTCaATACA 


TATCATTCAA 


TTATTGATAT 


ATGTACCGTT 


TTGTTTTTTC 


TTAACTGCCt 


CGGTGACGTT 


ATTAACATCA 


ACACTCGGTG 


TGTTAGTTAG 


AGATACACAA 


ATGTTAATGC 


AAGCAATATT 


AAGAATATTA 


TTTTACTTTT 


CACCAATTTT 


GTGGCTACCA 


AAGAACCATG 


GTATCAGTGG 


TTTAATTCAT 


GAAATGATGA 


AATATAATCC 


AGTTTACTTT 


ATTGCTGAAT 


CATACCGTGC 


AGCAATTTTA 


TATCACGAAT 


GGTATTTCAT 


GGATCATTGG 


AAATTAATGT 


TATACAATTT 


CGGTATTGTT 


GCCATTTTCT 


TTGCAATTGG 


TGCGTACTTA 


CACATGAAAT 


ATAGAGATCA 


ATTTGCAGAC 


TTCTTGTAAT 


ATATTTATAT 


GACGAAACCC 


CGCTAACCAT 


TAATAAATGG 


AAGTGGGGTT 


CATTTTTGTT 


TATAATTTAA 


GTAAATAACA 


TATTAAGTTG 


GTGTATTATG 


AACGTTTTAA 


TAAAGAAATT 


TTATCATTTG 


GTAGTTCGAA 


TACTTTCTAA 


AATGATTACG 


CCTCAAGTGA 


TTGATAAACC 


GCATATCGTA 


TTTATGATGA 


CTTTTCCAGA 


AGATATTAAG 


CCTATCATCA AAGCATTAAA 


TAATTCGTCG 


TATCAGAAAA 


CTGTTTTAAC 


AACACCAAAA 


CAACffiGCCTT 


ATTTATCTGA 


ACTTAGCGAC 


GATGTTGATG 


TGATAGAAAT 


GACTAATCGA 


ACATTGGTAA 


AACAAATTAA 


GGCTTTGAAA 


AGCGCGCAGA 


TGATTATTAT 


CGATAATTAT 


TACCTATTGC 


TAGGTGGATA 


TAATAAGACT 


TCTAATCAAC 


ACATTGTTCA 


AACGTGGCAT 


GCAAGTGGTG 


CATTAAAAAA 


CTTTGGCTTA 


ACAGATCATC 


AAGTCGATGT 


GTCTGACAAG 


GCAATGGTTC 


AGCAGTACCG 


TAAAGTTTAT 


CAAGCGACGG 


ATTTTTACTT 


AGTGGGTTGT 


GAACAAATGT 


CACAATGTTT 


TAAACAGTCT 


TTAGGTGCAA 


CAGAAGAGCA 


AATGCTGTAT 



TTTGGGCTTC CGAGAATTAA TAAATATTAC ACAGCTGATA GAGCAACGGT TAAGGCAGAG 


TTAAAGGATA AATATGGAAT TACAAATAAG TTGGTATTAT ATGTACCAAC ATATAGAGAA 
GATAAAGCAG ATAATAGGGC TATTGATAAA GCTTATTTTG AAAAATGTTT ACCAGGATAT 
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ATCGACACGT 


CTACATTAAT 


GCTAATGTCA 


GATATAATTA 


TTAGCGACTA 


TAGTTCGCTG 


1500 




CCAATAGAAG 


CTAGCTTGTT 


AGATATTCCA 


ACTATATTTT 


ATGTGTATGA 


TGAAGGAACA 


1560 


5 


TATGATCAGG 


TGAGAGGCCT 


GAATCAATTT 


TACAAAGCAA 


TACCGGATAG 


CTACAAAGTG 


1620 




TATACTGAAG 


AAGATTTAAT 


AATGACGATA 


CAAGAAAAAG 


AACATCTATT 


AAGTCCGTTA 


1680 


10 


TTTAAAGATT 


GGCATAAGTA 


TAATACTGAT 


AAAAGTTTAC 


ATCAGCTCAC 


AGAATATATA 


1740 


GATAAGATGG 


TGACAAAATG 


AGGTTTACGA 


TAATCATACC 


TACATGTAAT 


AATGAGGCAA 


1800 




CAATTCGACA 


ATTGTTAATA 


TCTATTGAGA 


GTAAAGAACA 


CTATAGAATC 


CTTTGTATTG 


1860 


15 


ATGGTGGTTC 


TACTGATCAA 


ACAATTCCTA 


TGATTGAACG 


GTTACAAAGA 


GAACTCAAGC 


1920 




ATATTTCATT 


AATACAATTA 


CAAAATGCTT 


CGATAGCTAC 


GTGTATTAAT 


AAAGGTTTGA 


1980 




TGGATATCAA 


AATGACAGAT 


CCACATGATA 


GTGACGCATT 


TATGGTCATA 


AAACCAACAT 


2040 


20 


CAATCGTATT 


GCCAGGTAAA 


TTAGATAGGT 


TAACTGCTGC 


TTTCAAAAAT 


AATGATAATA 


2100 




TTGATATGGT 


AATAGGGCAG 


CGAGCTTACA 


ATTAC CATGG 


TGAATGGAAA 


TTGAAAAGTG 


2160 




CTGATGAGTT 


TATTAAAGAC 


AATCGAATCG 


TTACATTAAC 


GGAACAACCA 


GATTTGTTAT 


2220 


25 


CAATGATGTC 


TTTTGACGGA 


AAGTTATTCA 


GTGCTAAATT 


TGCTGAATTA 


CAGTGTGaCG 


2280 




AAACTTTAGC 


TAACaCATAC 


AATCACGCAA 


TACTTGTCAA 


GGCGATGCAA 


AAAGCTACGG 


2340 




ATATACATTT 


AGTTTCACAG 


ATGATTGTCG 


GAGATAACGA 


TATAGATACA 


CATGCTACAA 


2400 


30 


GTAACGATGA 


AGATTTTAAT 


AGATATATCA 


CAGAAATTAT 


GAAAATAAGA 


CAACGAGTCA 


2460 




TGGAAATGTT 


ACTATTACCT 


GAACAAAGGC 


TATTATATAG 


TGATATGGTT 


GATCGTATTT 


2520 


35 


TATTCAATAA 


TTCATTAAAA 


TATTATATGA 


ACGAACACCC 


AGCAGTAACG 


CACACGACAA 


2580 


TTCAACTCGT 


AAAAGACTAT 


ATTATGTCTA 


TGCAGCATTC 


TGATTATGTA 


TCGCAAAACA 


2640 




TGTTJGACAT 


TATAAATACA 


GTTGAATTTA 


TTGGTGAGAA 


TTGGGATAGA 


GAAATATACG 


2700 


40 


AATTGTGGCG 


ACAAACATTA 


ATTCAAGTGG 


GCATTAATAG 


GCCGACTTAT 


AAAAAATTCT 


2760 




TGATACAACT 


TAAAGGGAGA 


AAGTTTGCAC 


ATCGAACAAA 


ATCAATGTTA 


AAACGATAAC 


2820 




GTGTACATTG 


ATG AC CAT AA 


ACTGCAATCC 


TATGATGTGA 


CAATATGAGG 


AGGATAACTT 


2880 


45 


AATGAAACGT 


GTAATAACAT 


ATGGCACATA 


TGACTTACTT 


CACTATGGTC 


ATATCGAATT 


2940 




GCTTCGTCGT 


GCAAGAGAGA 


TGGGCGATTA 


TTTAATAGTA 


GCATTATCAA 


. CAGATGAATT 


3000 




TAATCAAATT 


AAACATAAAA 


AATCTTATTA 


TGATTATGAA 


CAACGAAAAA 


TGATGCTTGa 


3060 




ATCAATACGC 


TATGTCGATT 


TAGTCATTCC 


AGAAAAGGGC 


TGGGGACAAA 


AAGAAGACGA 


3120 




TGTCGAAAAA 


TTTGATGTAG 


ATGTTTTTGT 


TATGGGACAT 


GACTGGGAAG 


GTGAATTCGA 


3180 
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TAAAATCAAA CAAGAATTAT ATGGTAAAGA TGCTAAATAA ATTATATAGA ACTATCGATA 3300 

CTAAACGATA AATTAACTTA GGTTATTATA AAATAAATAT AAAACGGACA AGTTTCGCAG 3360 

CTTTATAATG TGCAACTTGT CCGTTTTTAG TATGTTTTAT TTTCTTTTTC TAAATAAACG 3420 

ATTGATTATC ATATGAACAA TAAGTGCTAA TCCAGCGACA AGGCATGTAC CACCAATGAT 3480 

AGTGAATAAT GGATGTTCTT CCCACATACT TTTAGCAACA GTATTTGCCT TTTGAATAAT 3540

TGGCTGATGA ACTTCTACAG TTGGAGGTCC ATAATCTTTA TTAATAAATT CTCTTGGATA 3600

GTCCGCGTGT ACTTTACCAT CTTCGACTAC AAGTTTATAA TCTTTTTTAC TAAAATCACT 3660

TGGTAAAACA TCGTAAAGAT CATTTTCAAC ATAATATTTC TTACCATTTA TCCTTTGCTC 3720

ACCTTTAGAC AATATTTTTA CATATTTATA CTGATCAAAT GAGCGTTCCA TTAATGCATT 3780

CCCCATCATA TTACGTTGCT TCTCGCCACC AAGGTTTTTA TAGTCTCCTG CACCCATGAT 3840

AACTTGATTA ATTCTAAATT TACCTCGTTT GGTAGTAATC GTATGGTTGT AATTTGCTGT 3900

ATCACTTGAT CCAGTTTTTA AACCATCTGT ACCCGGCAAA CTCATTTTTG CACCTTCCAA 3960

TGAAAAGTTG AATGTGTAAT ACGTAACTGC ATGCGTTGTT GGTGCTAACT GCTTTGTAAA 4020

GTCTAATATT TTAGGTGTCT CTTTAATCAC GTGTAAATCT AAAATGGCAT AGTCTCTAGC 4080

AGTCGTTACA GTACGTTCTT GGTCTTTATA CTTTGTTGGT GCAAATGTAC GTAATCTTGA 4140

ATTTTCAGCA CCCGTTGGAT TGACGAAATG TGTATTTTTC ATTCCGATAG CTTTAGCTTT 4200

GTTATTCATT AAATCAACGA AATCGCTGGT GTTTTTTGAA ACCTTCTTAG CTAAAATTAA 4260

TGCCGCGGCA TTACTAGAAT TAGATACTGT AATTTGTAAT AGGTCTGCGA TTGTCCATAC 4320

TTGTCCAGGA TATAGTTTCG TATTACTCAA CTCAGGTAGT GTAGACATAA TATATTCTTT 4380

GTTCGTCATT GTGACTGTGT CATCAAGTGA AAGCTGCCCC TTATTTACAG CTTCCAATGT 4440

TAAGTACATT GTCATTAATT TAGTCATAGA CGCTGGAtTC CACTTAGTAT CGATATTGTA 4500

TTGATACAGT AATTGTCCAG TTTGACTTAC ATTAACAGCA CTCGTCGGTT CGTATGCAGC 4560

CGACAAACCT GCATAACCAT ATTGATTTGC TGCTTGTACA GGGGTTACGT CACTGTTAGT 4620

AGCTTGTGCA TATGGTGTCA TAATACTTAA TGTTAAACAT AAAATGATGA TAATAGATAT 4680

TAAATTTTTC ATAAAGCGTT AATCTTCCCT TTTCCAATTC TTAAATATTC CCTAAAAGCA 4740

ATGGTTATTC CTACTTACGG AAATCATTGC TAATTCACTT CACCTTAATT AAATTGTTGA 4800

AAATAAAGTT TTCTGCAGTT AATTTGAAAA ATAATGCAAA TATATTACGT GTGTAGCTAA 4860

AGGTGTTATA ATGTTTGTAC GAAGAGCAAA CTTACTCAAA AGCGATTAAT TTTCATGTTT 4920

TAATATAAAG ACTTTGAGAA GTTATTACAA AAAATGCAAT AGAAATATTC TATCATATAA 4980
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AAGTATATGA 


TAGAAATGCA 


TGTATCTATC 


TAAATGAATT 


AACTATAAAT 


TTCAAACAGA 


5100 




AGAGGTAAAA 


CTATGAAACG 


AGAAAATCCA 


TTGTTTTTCT 


TATTTAAAAA 


ACTATCATGG 


5160 


5 


CCAGTGGGTC 


TTATCGTTGC 


AGCTATCACT 


ATTTCATCAC 


TAGGGAGCTT 


AAGTGGACTA 


5220 




TTAGTGCCAC 


TGTTTACTGG 


ACGAATTGTA 


GATAAATTTT 


CCgTGAGCCA 


TATCAATTGG 


5280 


10 


AATCtAATCG 


CATTATTTGG 


TGGTATCTTT 


GTCATCAATG 


CTTTATTAAG 


CGGATTAGGT 


5340 


TTATATTTAT 


TAAGTAAAAT 


TGGTGAAAAG 


ATTATTTATG 


CGATACGCTC 


AGTTTTATGG 


5400 




GAGCATATCA 


TACAATTAAA 


AATGCCATTC 


TTTGACAAAA 


ATGAAAGTGG 


TCAATTAATG 


5460 


15 


AGTCGATTAA 


CTGACGATAC 


GAAAGTGATA 


AATGAATTTA 


TTTCACAAAA 


GCTACCTmAC 


5520 




TTATTACCAT 


CAATCGTTAC 


AT t AGTTGGG 


TCACTAATCA 


TGTTATTTAT 


TTTAGATTGG 


5580 




AAAATGACAT 


TATTAACATT 


TATAACGATA 


CCGATATTCG 


TTTTaATTAT 


GATTCCTCTA 


5640 


20 


GGTCGTATTA 


TGCAAAAGAT 


ATCGACAAGT 


ACACAATCTG 


AAATTGCAAA 


CTTCAGTGGT 


5700 




TTGTTAGGGC 


GTGTCCTAAC 


TGAAATGCGT 


CTTGTTAAAA 


TATCAAATAC 


AGAGCGTCTT 


5760 




GAATTAGATA 


ATGCACATAA 


AAATTTGAAT 


GAAATATATA 


AATTAGGTTT 


AAAACAGGCT 


5820 


25 


AAAATTGCGG 


CAGTTGTACA 


ACCAATTTCA 


GGTATAGTTA 


TGTTGCTAAC 


AATTGCAATT 


5880




ATTTTAGGTT 


TTGGTGCATT 


AGAAATTGCG 


ACTGGTGCAA 


TCACTGCAGG 


TACATTAATT 


5940 




GCAATGATAT 


TTTATGTTAT 


TCAGTTATCT 


ATGCCTTTAA 


TCAATCTTTC 


CACGTTAGTT 


6000 


30 


ACAGATTATA 


AAAAGGCAGT 


CGGTGCAAGT 


AGTAGAATAT 


ACGAAATCAT 


GCAAGAACCT 


6060 




ATTGAACCGA 


CAGAAG CTCT 


TGAAGATTCT 


GAAAATGTAT 


TAATTGATGA 


CGGTGTATTG 


6120 


35 


TCATTTGAAC 


ATGTAGACTT 


TAAATATGAT 


GTGAAGAAAA 


TATTAGATGA 


TGTGTCGTTC 


6180 


CAAATCCCAC 


AAGGTCAAGT 


GAGTGCTTTT 


GTAGGCCCTT 


CTGGGTCTGG 


TAAAAGTACG 


6240 




ATATTTAATC 


TGATAGAACG 


TATGTATGAA 


ATTGAGTCAG 


GTGATATTAA 


ATATGGCCTT 


6300 


40 


GAAAGTGTCT 


ATGATATCCC 


GTTATCTAAG 


TGGCGACGCA 


AAATTGGATA 


TGTTATGCAA 


6360 




TCAAATTCGA 


TGATGAGTGG 


TACAATTAGA 


GACAATATTT 


TATACGGAAT 


TAATCGTCAT 


6420 




GTTTCAGATG 


AAGAACTTAT 


TAATTATGCT 


AAATTAGCGA 


ACTGTCATGA 


TTTTATCATG 


6480 


45 


CAATTTGATG 


AAGGATATGA 


CACGCTTGTA 


GGTGAACGAG 


GATTGAAACT 


GTCTGGCGGA 


6540 




CAACGTCAAC 


GTATTGATAT 


TGCTAGAAGT 


TTTGTTAAAA 


ATCCTGATAT 


TTTGTTACTT 


6600 




GATGAAGCAA 


CAGCTAATCT 


CGATAGTGAA 


AGTGAATTGA 


AAATTCAAGA 


AGCTTTAGAA 


6660 




ACATTGATGG 


AAGGTAGAAC 


AACGATTGTC 


ATTGCGCATC 


GTTTGTCTAC 


AATTAAAAAA 


6720 




GCCGGTCAAA 


TTATATTCTT 


AGACAAAGGA 


GAGGTAACAG 


GTAAAGGTAC 


GCATTCAGAA 


6780 
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TTTTATATAT ATAAGTAAGC TTGGAGCAAA TACACATATA CCATCGAGGA AATTAAAGTG 
TGGCACATTG ATGGATATAG ATGTTAATAA ATTGCTTCAA GCTTTTGTCT ATTTTAAATC 
ATTTGAGAAG TTACGACATA ATAATTCTTA AATTAATGAA ATCGATATTT TAAGAAAAAA 
ATGCTCATGG TATAATACAA GTTATAAGCA AACATACATA TATTAAATAC TGTAGCCACG 
AGTCATAATT CTTCATATTT TACATAGCAA TTTAACTGAT TTTAGAGTCC ACGGTACAGA 
AGTTTGATAT TTCAATGTTT CTAAATTTTT AAAAAATTAA ATCATAGGTG GGTGCCAAAT 
GTTTTTATTA ATCAACATTA TTGGTCTAAT TGTATTTCTT GGTATTG CGG TATTATTTTC 
AAGAGATCGC AAAAATATCC AATGGCAATC AATTGGGATC TTAGTTGTTT TAAACCTGTT 
TTTAGCATGG TTCTTTATTT ATTTTGATTG GGGTCAAAAA GCAGTAAGAG GAGCAGCCAA 
TGGTATCGCT TGGGTAGTTC AGTCAGCGCA TGCTGGTACA GGTTTTGCAT TTGCAAGTTT 
GACAAATGTT AAAATGATGG ATATGGCTGT TGCAGCCTTA TTCCCAATAT TATTAATAGT 
GCCATTATTT GATATCTTAA TGTACTTTAA TATTTTACCG AAAATTATTG GAGGTATTGG 
TTGGTTACTA GCTAAAGTAA CAAGACAACC TAAATTCGAG TCATTCTTTG GGATAGAAAT 
GATGTTCTTA GGAAATACTG AAGCATTAGC CGTATCAAGT GAGCAACTAA AACGTATGAA 
TGAAATGCGT GTATTAACAA TCG CAATGAT GTCAATGAGC TCTGTATCGG GAGCTATTGT 
AGGTGCGTAT GTACAAATGG TACCAGGAGA ACTGGTACTA ACGGCAATTC CACTAAATAT 
CGTTAACGCG ATTATTGTGT CATGCTTGTT GAATCCAGTA AGTGTTGAAG AGAAAGAAGA 
TATTATTTAC AGTCTTAAAA ACAATGAAGT TGAACGTCAA CCATTCTTCT CATTCCTTGG 
AGATTCTGTA TTAGCAGCAG GTAAATTAGT ATTAATCATC ATCGCATTTG TTATTAGTTT 
TCTA<^CTTJ^GCTGATCTAT TTGATCGTTT^ATCAATfTC 

ATGGATAGGC ATAAAAGGTA GTTTCGGTTT AAACCAAATT TTAGGTGTGT TTATGTATCC 
ATTTGCGCTA TTACTCGGTT TACCTTATGA TGAAGCGTGG TTGGTAG CAC AACAAATGGC 
TAAGAAAATT GTTACAAATG AATTTGTTGT TATGGGTGAA ATTTCTAAAG ATATTGCATC 
TTATACACCA CACCATCGTG CGGTTATTAC AACATTCTTA ATTTCATTTG CAAACTTCTC 
AACGATTGGT ATGATTATCG GTACATTGAA AGGCATTGTT GATAAAAAGA CATCAGACTT 
TGTATCTAAA TATGTACCTA TGATGCTATT ATCAGGTATC CTAGTTTCAT TATTAACAGC 
AGCTTTCGTT GGTTTATTTG CATGGTAATA TGTCGAAGAG TGACTATGAT AATACATTTT 
AACTAATAAA TATGTCCAGG CATGTCGTCT ATTGATATAG GTGAGATGCT TGGACTTTTT 
TATTATTGAT ATAAAGGTAT nTAAATATTT TTAAAGTTAC CGAAATTGAA GCATTATAAA 
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GACAGTAAGG ACTAGGTACA GTCATAGTAC TTCGAGCAAA ATTTGTTTTG TTATTATAAA 
CAACACAAAG GAGATAACTT CTCTAnTGAA GAAGTTAAAA A CATT AT AG C AGACAATGAA 
ATGAAAGTAA ATTAAAAAT 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31096 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
GTTGCAGTAG TCAAAGAATT AAACAAGGTG AAGGcGTGTA GCTTGCACAC CCGAAAATGT
GCGTAAGTTA aCGGATGCAG GACATAAAGT AATTGTTGAA AAAAATGCTG GCATTGGTTC

AGGATTTTCT AACGATATGT ATGAAAAAGA AGGCGCTAAG ATCGTAACTC ACGAACAAGC
ATGGGAAGCT GATCTTGTTA TCAAAGTAAA AGAACCTCAT GAAAGCGAAT ATCAATATTT
CAAAAAGAAT CAAATTATCT GGGGATTTTT ACATCTAGCA TCTTCAAAAG AAATAGTAGA
AAAAATGCAA GAAGTTGGTG TAACTGCGAT TAGTGGTGAA ACCATTATAA AAAATGGAAA
AGCAGAATTA TTAGCGCCAA TGAGTGCTAT AGCAGGTCAA CGCTCAGCAA TTATGGGAGC
TTACTACTCT GAAGCACAAC ATGGTGGTCA AGGTACTTTA GTGACTGGTG TACATGAAAA
TGTGGATATA CCTGGTAGTA CATATGTGAT TTTCGGTGGT GGAGTAGCAG CAACAAATGC
AGCAAATGTT GCCTTGGGAC TAAATGCTAA AGTAATCATT ATCGAGTTAA ACGATGACCG
CATTAAATAT CTTGAAGATA TGTATGCAGA AAAAGATGTC ACAGTAGTCA AATCAACACC
AGA^AATTTA GCAGAACAAA TTAAGAAAGC AGATGTATTT ATTTCTACAA TTTTAATTTC



8700
8760 
8779 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 

AGGTGCGAftA CCGCCAAAAT TGGTTACTCG TGAGATGGTT AAATCAATGA AAAAAGGTTC 780

AGTATTAATC GATATAGCTA TTGACCAAGG TGGAACTATT GAAACAATTA GACCAACTAC 840
AATTTCTGAT CCAGTGTATG AAGAAGAAGG TGTGATTCAT TATGGTGTAC CAAATCAACC 900

AGGAGCAGTC CCAAGAACTT CAACAATGGC ATTAGCACAA GGAAATATTG ATTATATATT 960

AGAAATTTGT GACAAAGGCT TAGAACAAGC AATTAAAGAT AATGAAGCCT TAAGTACTGG 1020
TGTAAACATT TACCAAGGAC AAGTGACAAA TCAAGGATTA GCTTCATCAC ATGACCTAGA 1080

TTATAAAGAA ATATTAAATG TTATCGAATA GATAGTAATT TAAATGAAAT TGAGTGAAAT 1140

GAATATTTTA AATATAGCAT TATAGTTTGG ACTAAAAATT TACAAAACGG AAGGATGTAA 1200
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TCGAAGAAGC TAAAGCAAGC ATTAAACCAT TTATTCGTCG AACACCTCTA ATTAAATCAA 132 0 

TGTATTTAAG CCAAAGTATA ACTAAAGGGA ATGTATTTCT AAAATTAGAA AATATGCAAT 1380

TCACAGGATC TTTTAAATTT AGAGGCGCTA gCAATllAAAA TTAATCACTT AAGAGATGAA 1440

CAAAAAGAAA AAGGCATTAT CGCAGCATCT GCTGGGgAAC CATGCACAAG GTGTTGCTTT 1500

AACAGCTAAA TTATTAGGCA TTGATGCAAC GATTGTAATG CCTGAAACAG CACCACAAGC 1560

GAAACAACAA GCAACAAAAG GCTATGGGGC AAAGGTTATT TTAAAAGGTA AAAACTTTAA 1620

CGAAACTAGA CTTTATATGG AAGAATTAGC GAAAGAAAAT GGCATGACAA TCGTTCATCC 1680

ATATGACGAT AAGTTTGTAA TGGCAGGCCA AGGAACAATT GGTTTAGAAA TTTTAGATGA 1740

TATTTGGAAT GTGAATACAG TCATCGTACC AGTTGGCGGT GGAGGATTAA TTGCAGGTAT 1800

TGCCACCGCA TTAAAATCAT TTAACCCTTC AATTCATATT ATCGGTGTTC AATCTGAGAA 1860

TGTTCATGGT ATGGCTGAGT CTTTCTATAA GAGAGATTTA ACTGAACATC GAGTGGATAG 1920

CACAATAGCA GATGGTTGTG ATGTAAAAGT TCCTGGTGAA CAAACATATG AAGTAGTTAA 1980

ACATTTAGTA GATGAATTTA TTCTTGTTAC TGAAGAAGAA ATTGAACATG CTATGAAAGA 2040

TTTAATGCAG CGTGCCAAAA TTATTACTGA AGGTGCAGGC GCATTACCAA CAGCTGCAAT 2100

TTTAAGTGGA AAAATAAACA ATAAATGGCT TGAAGATAAA AATGTTGTTG CATTAGTTTC 2160

AGGCGGGAAT GTTGACTTAA CTAGAGTTTC AGGTGTCATT GAACATGGAC TGAATATTGC 2220

AGATACAAGC AAGGGTGTGG TAGGTTAAAA CATTTAATCT TAAAAATGAG GTGTAATTAT 2280

GTCAAATGGT AAAGAATTAC AAAAAAATAT AGGTTTCTTC TCAGCGTTTG CTATTGTTAT 2340

GGGGACAGTT ATTGGTTCAG GAGTATTCTT TAAAATATCA AACGTAACAG AAGTAACAGG 2400

AACAGCAGGA ATGGCCTTGT TTGTATGGTT CCTAGGCGGC ATCATTACCA TTTGTGCGGG 2460

GTTAACAGCA GCAGAACTTG CTGCTGCAAT CCCTGAAACA GGTGGCTTAA CGAAGTATAT 2520

AGAATATACA TACGGTGATT TCTGGGGCTT CCTATCAGGT TGGGCGCAAT CATTTATTTA 2580

TTTTCCAGCT AACGTAGCAG CATTGTCTAT CGTATTTGCG ACACAGCTAA TTAATTTATT 2640

CCATTTATCT ATAGGTTCGT TAATACCAAT AGCAATCGCA TCTGCGTTAT CTATTGTGTT 2700

GATAAATTTC CTAGGTTCAA AAGCAGGCGG AATTTTACAA TCAGTTACTT TAGTAATTAA 2760

ACTGATTCCA ATCATCGTTA TTGTAATTTT TGGTATTTTT CAATCTGGAG ATATCACTTT 2820

TTCATTAATT CCAACTACAG GTAATTCaGG AAATGGCTTC TTTACAGCAA TTGGTAGTGG 2880

TTTATTAGCA ACTATGTTTG CATATGATGG TTGGATTCAT GTAGGAAATG TTGCGGGGGA 2940

ACTTAAAAAT CCTAAACGCG ATTTACCTTT AGCGATTTGA GTTGGTATCG GTTGTATTAT 3000
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TGGTAATTTA AATGCAGCTT 


' CAGATACATC 


AAAAATATTA 


TTTGGTGAAA 


ATGGCGGTAA 


3120 




GATTATTACA 


> ATCGGTATAT 


TAATTTCTGT 


TTATGGTACG 


ATCAATGGCT 


ATACTATGAC 


3180 


5 


TGGTATGCGC 


GTACCATATG 


CAATGGCTGA 


AAGAAAATTA 


TTGCCATTTA 


GCCATTTATT 


3240 




CGCAAAATTA 


ACAAAATCTG 


GCGCACCATG 


GTTTGGCGCA 


ATTATACAAC 


TTATAATCGC 


3300 


10 


TAT CAT CATG 


ATGTCAATGG 


GAGCATTTGA 


TACAATTACA 


AATATGTTAA 


TCTTTGTTAT 


3360 


TTGGTTGTTC 


TATTGTATGT 


CATTTGTTGC 


GGTAATAATT 


TTAAGAAAAC 


GTGAACCAAA 


3420 




TATGGAACGA 


CCATATAAAG 


TACCGTTATA 


TCCGATCATA 


CCTTTAATTG 


CTATTTTGGC 


3480 


15 


AGGATCATTT 


GTATTAATTA 


ATACACTGTT 


TACACAATTT 


ATATTAGCAA 


TCATTGGAAT 


3540 




TCTAATAACA 


GCACTTGGTA 


TACCAGTTTA 


TTACTATAAA 


AAGAAACAAA 


AAGCAGCATA 


3600 




AGGTAAGATA 


ACTAGCATTG 


AGAATAAATG 


GATGGACTAC 


TAATAAATTT 


AAAGTTTTAC 


3660 


20 


ACATTAAAAT 


CAAAAACCAT 


TCAATTATTC 


TATGGAACAG 


ACAAATTTCT 


GTTATGGAAT 


3720 




TTGTCTGTTT 


TTCAAAAGTA 


TAGGGAGGCA 


AATAGAGATG 


GAAAAGC CGT 


CAAGAGAGGC 


3780 




ATTTGAAGGC 


AATAATAAGT 


TGTTAATAGG 


AATTGTTCTA 


AGTGTAATAA 


CGTTTTGGCT 


3840 


25 


ATTTGCACAA 


TCATTGGTTA 


ATGTTGTACC 


AATACTTGAA 


GATAGTTTCA 


ATACAGATAT 


3900 




TGGAACGGTT 


AATATCGCCG 


TTAGTATAAC 


TGCTTTATTT 


TCAGGAATGT 


TTGTAG TAGG 


3960 




AGCAGGTGGT 


CTTGCTGATA 


AATATGGCAG 


AATTAAACTC 


ACGAACATTG 


GTATTATCTT 


4020 


30 


AAATATATTA 


GGTTCATTAT 


TAATCATTAT 


TTCAAATATT 


CCTTTATTAC 


TTATTATAGG 


4080 




AAGATTAATT 


CAAGGACTTT 


C AG CAG CATG 


TATTATGCCT 


GCAACTTTGT 


CTATTATTAA 


4140 


35 


GTCATATTAC 


ATTGGGAAAG 


ATAGACAACG 


CGCTTTAAGT 


TATTGGTCAA 


TTGGCTCATG 


4200 


GGGCGGCTCT 


GGTGTTTGTT 


CATTTriTGG 


AGGTGCAGTT 


GCAACGCTTT 


TAGGTTGGCG 


4260 




TTGGATTTTC 


ATCCTATCAA 


TTATAATTTC 


ATTAATTGCA 


CTGTTTCTTA 


TTAAAGGCAC 


4320 


40 


ACCTGAAACT 


AAATCTAAAT 


CGATTTCTCT 


AAATAAATTT 


GACATTAAAG 


GTCTGGTTCT 


4380 




TTTAGTCATT 


ATGCTCCTCA 


GTTTAAATAT 


TTTAATTACT 


AAAGGATCAG 


AATTAGGTGT 


4440 




AACCTCACTT 


CTTTTTATTA 


CTTTATTAGC 


TATTGCAATT 


GGATCTTTTA 


GTTTATTTAT 


4500 


45 


AGTTCTTGAA 


AAGCGTGCTA 


CAAATCCTTT 


AATCGATTTT 


AAATTATTTA 


AAAATAAAGC 


4560 




TTACACAGGT 


GCAACAGCTT 


CAAACTTTTT 


GTTAAATGGT 


GTTGCAGGAA 


CATTAATAGT 


4620 




AGCCAACACA 


TTTGTTCAAA 


GAGGTTTAGG 


ATATTCTTCA 


TTG CAAGCAG 


GAAGTTTATC 


4680 


50 


AATCACTTAT 


TTAGTAATGG 


TACTAATTAT 


GATTCGTGTT 


GGTGAAAAGT 


TACTTCAAAC 


4740 




ACTCGGATGC 


AAGAAACCAA 


TGTTAATTGG 


AACAGGAGTT 


CTTATTGTCG 


GAGAATGTCT 


4800 
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ATTCTTTGGT TTAGGACTAG GGATATATGC TACACCATCA ACAGATACAG CAATTGCAAA 4 920 

TGCACCGTTA GAAAAAGTAG GCGTTGCTGC AGGTATCTAT AAAATGGCTT CTGCATTAGG 4980

TGGAGCATTT GGCGTCGCAT TGAGTGGTGC AGTATATGCA ATCGTATCAA ATATGaCAAA 5040

CATTTATACA GGTGcAATGa TTGnCATTAT GGTTaAATGC AGGTATGGGa ATATTATCaT 5100

TCGTTATCAT TTTGtTACTT GTGcCTAAAC mAAACGACAC TCAATTATGA TAATTGAGAA 5160

TTAAATTGAA ATCATACAAG TCGCTACAAT ATTAAACAAA AATATAAACC GATTCTTATG 5220

TGTCATTATT TTAAATGAAC ATAGGGATTG GTTTTTTATT ACTCTTTTAC GCTACTTTAT 5280

TTATAATTAT TATAAATTGT CACAAATTCA ATTTACCTTA CAATATATTT TGTGTTATTA 5340

TATTCTGGAG CATAAATAAA TTGTTCAACA CATAGTTGTA ATGTGTTTCA ATACTTTTTG 5400

GATAGATTGC GAAATTGTAT TGAATCGTCA TCGTTTTAAA TTTTTAAATG AGAATGGAAT 5460

GAGCATTACA ATACACAAGC AATCAAAAGT AAATACATTC ACAACACAAC AGAGACATAA 5520

CAACAAGATA AGGAGTGAAC AATAGCTGTG AATTATCGTG ATAAAATTCA AAAGTTTAGT 5580

ATTCGTAAAT ATACAGTTGG TACATTTTCA ACTGTCATTG CGACATTGGT ATTTTTAGGA 5640

TTCAATACAT CACAAGCACA TGCTGCTGAA ACAAATCAAC CAGCAAGCGT GGTTAAACAG 5700

AAACAACAAA GTAATAATGA ACAGACTGAG AATCGAGAAT CTCAAGTACA AAATTCTCAA 5760

AATTCACAAA ATGGTCAATC ATTATCTGCT ACTCATGAAA ATGAGCAACC AAATATTAGT 5820

CAAGCTAATT TAGTAGATCA AAAAGTAGCG CAATCATCTA CTACTAATGA TGAACAACCA 5880

GCATCTCAAA ATGTAAATAC AAAGAAAGAT TCGGCAACGG CTGCGACAAC ACAACCAGAT 5940

AAAGAACAAA GTAAGCATAA ACAAAACGAA AGTCAATCTG CTAATAAAAA TGGAAACGAC 6000

AATAGAGCGG CTCATGTAGA AAATCATGAA GCAAATGTAG TAACAGCTTC AGATTCATCT 6060

GATAATGGTA ACGTACAACA TGACCGAAAT GAATTACAAG CGTTTTTTGA TGCAAATTAT 6120

CATGATTATC GCTTTATTGA CCGTGAAAAT GCAGATTCTG GCACATTTAA CTATGTAAAA 6180

GGCATTTTTG ATAAGATTAA TACGTTATTA GGCAGTAATG ATCCAATAAA CAATAAAGAC 6240

TTGCAACTTG CATACAAAGA ATTGGAACAA GCTGTTGCTT TAATTCGTAC AATGCCTCAA 6300

CGTCAACAGA CTAGCCGACG TTCAAATAGA ATTCAAACGC GTTCGGTTGA GTCAAGAGCT 6360

GCAGAGCCTA GATCAGTATC AGACTATCAA AATGCAAATT CATCATATTA TGTTGAAAAT 6420

GCTAATGATG GTTCGGGCTA TCCTGTTGGT ACATATATCa ATGCTTCTAG TAAAGGGGCG 6480

CCATATAATT TACCAACTAC ACCATGGAAT ACATTGAAGG CCTCTGACTC AAAGGAAATT 6540

GCTCTTATGA CAGCGAAACA AACTGGAGAC GGGTACCAAT GGGTTATTAA GTTTAATAAA 6600
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GTAGGAAGAA 


CTGAPTTTPT 


A A PA.PTT A AT* 


TCAGATGGAA 


CAAATGTACA 


ATGGAGTCAT 


6720 




GGAGCAGGAG 


CJXGGTGPAAA 


1 AAAv- L_AC_ rr 


CAACAAATGT 


GGGAATATGG 


AGTAAATGAT 


6780 


5 


CCTCATCGTT 


CACATGACTT 


TA AP A 


AATAGAAGTG 


GCCAAGTAAT 


ATATGACTGG 


6840 




CCAACTGTCC 


ATATTTATTP 


Hi ALiAAvjA 1 


1 1 ATCT AG AG 


CGAGTGATTA 


TTTTAGTGAA 


6900 




GCTGGAGCGA 


CACCTGCTA P 




GGTAGACAAA 










1 rt/iHbL 11X1 


AT'l'T^l'GAATA 


TATTAATGGT 


6960 


10 


CAAAAACCTG 


CTGAATCAPP 


rzn/~lT'f~"T m 'i ■ p r ,r n 


AAAG 1 1 1 ATA 


CTTTCATCGG 


TCAAGGTGAT 


7020 




G CAAGTT AT A 


CAATTTC A TT 

XXX X X 




GGTCCAACTG 


TTAATAAATT 


GTACTATGCA 


7080 


15 


GCAGGTGGGC 


GTGCTTTAGA 

« x vj x x x r\wn 


P.TAPA A TP A a 


TTATTTATGT 


ACAGTCAACT 


ATACGTCGAA 


7140 


TCAACG CAAG 


AC PAT P A A P A 


AUVj 1LI 1 AAT 


GGTTTAAGAC 


AAGTGGTTAA 


TCGTACATAT 


7200 




CG CATAGGTA 


PA APTt & IV PP 


TPTAPA A/"»«TV*» 

1 iAGAAGTG 


AGTCAAGGAA ATGTACAAAC GAAAAAGGTA 


7260 


20 


TTAGAAAGTA 


CA A A PPT IV a A 


1 A I AtaATGAT 


TTTnTTP, A TP. 


nl LL 1 1 1AAG 


TTATGTTAAG 


7320 




ACG CCGAGTA 


A T iv iv a rz tptt 

r\ X rtrtrtV? lull 


AbOATTTTAT 


T CG AAT AATfi 


PA A AT APT A A 
UwA 1 At 1 AA 


l\jLrrn'AGA 


7380 




CCGGGTGGAG 


pppanpiaTT 


AAA IvjAATAT 


PAATTA AP,TP 


A A *T*T* A ' I** 1 A 
AM.1 1 A I 1 1AL 


1GATCAAAAA 


7440 


25 


TTACAAGAAG 


PAP.PA ana ap 


x Atj AAACCCA 


ATAAG ATT A A 


lunl ILvval 1 1 


C- UACTAT C CT 


7500 




GATGCTTATCJ 


fJTAAT APTP A 


Ac TTTAGTT C 


CTP.TT A A PTT 
^— x \3 x x rviv. x x 


A A PPP T a TT A 
/n-rt t\j-Vj x A 1 1 A 


C CTG AAATCC 


7560 




AA CAT AAT A f 


TaAATTPTTT 
x a/in i A v_ 1 ± J. 


A A A R A TP Tv /~~*/~* 

AAAAATGACG 


ATACTCAAAA 


TATTG CTGAA 


AAACCATTTT 


7620 


30 


CAAAACAAGC 


TPPPPATPPA 


ul 11 it 1 ATG 


TATATGCAGG 


TAAC CAAGGG 


AATGCTTCCG 


7680 




TGAATTTAGG 


tgp.tap.ppta 


ALA ILi A I TC 


AACCATTACG 


TATTAATTTA 


ACAAGTAATG 


7740 




AGAATTTTAC 


AGATAAAPAT 


TPPPTA A RTT& 

IuVjLAAAI jl a 


CAGGTATTCC 


GCGTACATTA 


CACATTGAAA 


7800 


35 


ACTCGACAAA 


TAGAC PT A AT 


A ATPPPAP2&.P 
Art. x AjLL AJjAXj 


AACGCAATAT 


TGAACTTGTT 


GGTAACTTAT 


7860 




TACC&GGGGA 


TTA PTTTPP A 

x x XXX vj\J/A 


RPPRT7APPTT 
ALljAlAV_bl 1 


TTGGACGTAA 


AGAACAATTA 


TTCGAAATTC 


7920 


40 


GTGTTAAACC 


ACAT AP A PP A 


AP A ATT Ti. PR A 


CGACAGCTGA 


GCAATTAAGA 


GGTACAGCAT 


7980 


TACAAAAAGT 


GCCTGTAAT 


A 1 1 1 vAtVjv»AA 


TACCGTTGGA 


TCCATCGGCA 


TTGGTTTATT 


8040 




TAGTTGCACC 


AAPAA ATP A A 


A p*|* & PP A A T/"^ 
Av. 1 AtAjAA 1 Ca 


GTGGTAGTGA 


GGCAGATCAA 


ATACCATCTG 


8100 


45 


GTTATACGAT 


APTTP.PP A PT 


ppt a p a r*r*T*f* 


ATGGGGTGCA 


TAATACAATT 


ACTATACGAC 


8160 




CGCAAGATTA TGTTGTATTC ATACCACCTG TAGGTAAACA AATTAGAGCA GTAGTTTATT 8220

ATAATAAAGT AGTTGCATCT AATATGAGTA ATGCTGTTAC TATTTTGCCA GATGACATTC 8280

CACCAACAAT CAATAATCCT GTTGGAATAA ATGCCAAATA CTATCGAGGC GACGAAkCAA 8340

CTTTACAATG GGTGTCTCTG ATAGACATTC TGGTATAAAA AATACAACTA TTACGACATT 8400
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TACAGGTAGA GTGAGTATGA ATCAGGCATT TAACAGTGAT ATTACATTTA AAGTGTCAGC 8520 

GACAGaCAAT GTCAATAATA CGACAAATGA TAGTCAATCT AAACATGTTT CAATTCATGT 8580

AGGTAAAATT AGTGAAGATG CTCATCCGAT TGTATTAGGA AATACTGAGA AAGTTGTAGT 8640

AGTCAATCCG ACTGCTGTAT CTAATGATGA AAAGCAAAGC ATAATTACTG CCTTTATGAA 8700

TAAAAACCAA AATATAAGAG GATATTTAGC ATCAACTGAT CCAGTAACTG TCGATAATAA 8760

TGGTAATGTC ACATTACATT ACCGTGATGG CTCATCGACA ACGCTTGATG CTACAAATGT 8820

GATGACATAC GAACCAGTTG TGAAACCTGA ATACCAAACT GTCAATGCTG CTAAAACAGC 8880

AACGGTAACG ATTGCTAAAG GACAATCATT TAGTATTGGT GATATTAAAC AATATTTTAC 8940

TTTAAGTAAT GGACAACCTA TTCCAAGTGG CACATTTACA AATATTACAT CTGATAGAAC 9000

TATTCCAACT GCACAAGAAG TTAGTCAAAT GAACGCAGGC ACGCAGTTAT ACCATATAAC 9060

TGCTACAAAT GCGTATCATA AAGATAGTGA AGACTTCTAT ATTAGTTTGA AAATCATCGA 9120

TGTGAAACAA CCAGAAGGCG ATCAACGTGT ATATCGTACA TCAACATATG ATTTAACTAC 9180

TGATGAAATC TCAAAAGTAA AACAAGCATT TATTAATGCA AATAGAGATG TAATTACGCT 9240

TGCCGAAGGT GATATTTCAG TTACAAATAC ACCTAATGGT GCTAATGTAA GTACTATTAC 9300

AGTAAATATT AATAAAGGTC GATTAACGAA ATCATTCGCG TCAAACCTAG CTAATATGAA 9360

TTTCTTGCGT TGGGTTAATT TCCCACAAGA TTATACAGTG ACATGGACGA ATGCAAAAAT 9420

TGCAAACAGA CCAACAGATG GTGGTTTATC ATGGTCTGAT GACCATAAAT CTTTAATTTA 9480

TCGTTATGAT GCTACATTAG GTACTCAAAT TACGACGAAT GATATTTTAA CAATGTTAAA 9540

AGCAACAACT ACAGTGCCTG GATTGCGAAA TAACATTACT GGTAATGAAA AATCACAAGC 9600 



AGAAGCTGGC GGAAGACCTA ACTTTAGAAC GACTGGTTAT TCACAATCAA ATGCGACAAC 9660

TGATGGTCAA CGTCAATTTA CGTTGAATGG TCAAGTGATT CAAGTGTTAG ACATCATCAA 9720

CCCTTCAAAC GGTTATGGTG GGCAACCTGT TACAAATTCA AATACTCGTG CAAACCATAG 9780 

TAACTCAACT GTTGTTAACG TAAACGAACC GGCAGCTAAT GGTGcTGGCG CATTTACAAT 9840 

TGACCACGTT GTAAAAAGTA ATTCTACACA TAATGCAAGT GATGCAGTTT ATAAAGCACA 9900 

GTTATACTTA ACGCCATATG GTCCAAAACA ATATGTTGAA CATTTAAATC AAAATACAGO 9960 

AAATACTACT GACGCTATTA ACATTTATTT TGTACCAAGT GACTTAGTGA ATCCAACAAT 10020 

TTCAGTAGGT AATTACACTA ATCATCAAGT GTTCTCAGGT GAAACATTTA CAAATACTAT 10080 

TACAGCGAAT GATAACTTTG GTGTGCAATC TGTAACTGTA CCAAATACAT CACAAATTAC 10140

AGGTACTGTT GATAATAACC ATCAACATGT TTCTGCAACG GCACCAAATG TGACATCAGC 10200 
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GTTCAATGTA 


ACAGTGAAAC CTTTGCGTGA TAAATATCGA GTTGGTACTT CATCAACGGC 10320

TGCTAATCCT GTGAGAATTG CCAATATTTC GAATAATGCG ACAGTATCAC AAGCTGATCA 10380

AACGACAATT ATTAATTCGT TAACGTTTAC TGAAACAGTA CCAAATAGAA GTTATGCAAG 10440

AGCAAGTGCG AATGAAATCA CTAGTAAAAC AGTTAGTAAT GTCAGTCGTA CTGGAAATAA 10500

TGCCAATGTg CACAGTAACT GTTACTTATC AAGATGGAAC AACATCAACA GTGACTGTAC 10560

CTGTAAAGCA TGTCATTCCA GAAATCGTTG CACATTCGCA TTACACTGTA CAAGGCCAAG 10620

ACTTCCCAGC AGGTAATGGT TCTAGTGCAT CAGATTACTT TAAGTTATCT AATGGTAGTG 10680

ACATTGCAGA TGCAACTATT ACATGGGTAA GTGGACAAGC GCCAAATAAA GATAATACAC 10740

GTATTGGTGA AGATATAACT GTAACTGCAC ATATCTTAAT TGATGGCGAA ACAACGCCGA 10800

TTACGAAAAC AGCAACATAT AAAGTAGTAA GAACTGTACC GAAACATGTC TTTGAAACAG 10860

CCAGAGGTGT TTTATACCCA GGTGTTTCAG ATATGTATGA TGCGAAACAA TATGTTAAGC 10920

CAGTAAATAA TTCTTGGTCG ACAAATGCGC AACATATGAA TTTCCAATTT GTTGGAACAT 10980

ATGGTCCTAA CAAAGATGTT GTAGGCATAT CTACTCGTCT TATTAGAGTG ACATATGATA 11040

ATAGACAAAC AGAAGATTTA ACTATTTTAT CTAAAGTTAA ACCTGACCCA CCTAGAATTG 11100

ACGCAAACTC TGTGACATAT AAAGCAGGTC TTACAAACCA AGAAATTAAA GTTAATAACG 11160

TATTAAATAA CTCGTCAGTA AAATTATTTA AAGCAGATAA TACACCATTA AATGTCACAA 11220

ATATTACTCA TGGTAGCGGT TTTAGTTCGG TTGTGACAGT AAGTGACGCG TTACCAAATG 11280

GCGGAATTAA AGCAAAATCT TCAATTTCAA TGAACAATGT GACGTATACG ACGCAAGACG 11340

AACATGGTCA AGTTGTTACA GTAACAAGAA ATGAATCTGT TGATTCAAAT GACAGTGCAa 11400

CAGTAACAGT GACACCACAA TTACAAGCAA CTACTGAAGG CGCTGTATTT ATTAAAGGTG 11460

GCGA&GTTT TGATTTCGGA CACGTAGAAA GATTTATTCA AAACCCGCCA CATGGGGCAA 11520

CGGTTGCATG GCATGATAGT CCAGATACAT GGAAGAATAC AGTCGGTAAC ACTCATAAAA 11580

CTGCGGTTGT AACATTACCT AATGGTCAAG GTACGCGTAA TGTTGAAGTT CCAGTCAAAG 11640

TTTATCCAGT TGCTAATGCA AAGGCGCCAT CACGTGATGT GAAAGGTCAA AATTTGACTA 11700

ATGGAACGGA TGCGATGAAC TACATTACAT TTGATCCAAA TACAAACACA AATGGTATCA 11760

CTGCAGCATG GGCAAATAGA CAACAACCAA ATAACCAACA AGCAGGCGTG CAACATTTAA 11820

ATGTCGATGT CACATATCCA GGTATTTCAG CTGCTAAACG AGTTCCTGTT ACTGTTAATG 11880

TATATCAATT TGAATTCCCT CAAACTACTT ATACGACAAC GGTTGGAGGC ACTTTAGGAA 11940

GTGGTACGCA AGCATCAGGA TATGCACATA TGCAAAATGC TACTGGTTTA CCAACAGATG 12000
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TGAATAAACC 


GAATGTGGCT AAAGTCGTTA ACGCAAAATA TGACGTCATC TATAACGGAC 12120

ATACTTTTGC AACATCTTTA CCAGCGAAAT TTGTAGTAAA AGATGTGCAA CCAGCGAAAC 12180

CAACTGTGAC TGAAACAGCG GCAGGAGCGA TTACAATTGC ACCTGGAGCA AACCAAACAG 12240

TGAATACACA TGCCGGTAAC GTAACGACAT ACGCTGATAA ATTAGTTATT AAACGTAATG 12300

GTAACGTTGT GACGACATTT ACACGTCGCA ATAATACGAG TCCATGGGTG AAAGAAGCAT 12360

CTGCAGCAAC TGTAGCAGGT ATTGCTGGAA CTAATAATGG TATTACTGTT GCAGCAGGTA 12420

CTTTCAACCC TGCTGATACA ATTCAAGTTG TTGCAACGCA AGGAAGCGGA GAGACAGTGA 12480

GTGATGAGCA ACGTAGTGAT GATTTCACAG TTGTCGCACC ACAACCGAAC CAAGCGACTA 12540

CTAAGATTTG GCAAAATGGT CATATTGATA TCACGCCTAA TAATCCATCA GGACATTTAA 12600

TTAATCCAAC TCAAGCAATG GATATTGCTT ACACTGAAAA AGTGGGTAAT GGTGCAGAAC 12660

ATAGTAAGAC AATTAATGTT GTTCGTGGTC AAAATAATCA ATGGACAATT GCGAATAAGC 12720

CTGACTATGT AACGTTAGAT GCACAAACTG GTAAAGTGAC GTTCAATGCC AATACTATAA 12780

AACCAAATTC ATCAATCACA ATTACTCCGA AAGCAGGTAC AGGTCACTCA GTAAGTAGTA 12840

ATCCAAGTAC ATTAACTGCA CCGGCAGCTC ATACTGTCAA CACAACTGAA ATTGTGAAAG 12900

ATTATGGTTC AAATGTAACA GCAGCTGAAA TTAACAATGC AGTTCaAGTT GCTAATAAAC 12960

GTACTGCAAC GATTAAAAAT GGCACAGCAA TGCCTACTAA TTTAGCTGGT GGTAGCACAA 13020

CGACGATTCC TGTGACAGTA ACTTACAATG ATGGTAGTAC TGAAGAAGTA CAAGAGTCCA 13080

TTTTCACAAA AGCGGATAAA CGTGAGTTAA TCACAGCTAA AAATCATTTA GATGATCCAG 13140

TAAGCACTGA AGGTAAAAAG CCAGGTACAA TTACGCAGTA CAATAATGCA ATGCATAATG 13200

CGCAACAACA AATCAATACT GCGAAAACAG AAGCACAACA AGTGATTAAT AATGAGCGTG 13260

CAACACCACA ACAAGTTTCT GACGCACTAA CTAAAGTTCG TGCAGCACAA ACTAAGATTG 13320

ATCAAGCTAA AGCATTACTT GAAAATAAAG AAGATAATAG CCAATTAGTA ACGTCTAAAA 13380

ATAACTTACA AAGTTCTGTG AACCAAGTAC CATCAACTGC TGGTATGACG CAACAAAGTA 13440

TTGATAACTA TAATGCGAAG AAGCGTGAAG CAGAAACTGA AATAACTGCA GCTCAACGTG 13500

TTATTGACAA TGGCGATGCA ACTGCACAAC AAATTTCAGA TGAAAAACAT CGTGTCGATA 13560

ACGCATTAAC AGCATTAAAC CAAGCGAAAC ATGATTTAAC TGCAGATACA CATGCCTTAG 13620

AGCAAGCAGT GCAACAATTG AATCGCACAG GTACAACGAC TGGTAAGAAG CCGGCAAGTA 13680

TTACTGCTTA CAATAATTCG ATTCGTGCAC TTCAAAGTGA CTTAACAAGT GCTAAAAATA 13740

GCGCTAATGC TATTATTCAA AAGCCAATAA GAACAGTACA AGAAGTGCAA TCTGCGTTAA 13800
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CTGATAATAG TGCTTTAAAA ACTGCTAAGA CGAAACTTGA TGAAGAAATC AATAAATCAG 13920 

TAACTACTGA TGGTATGACA CAATCATCAA TCCAAGCATA TGAAAATGCT AAACGTGCGG 13980 

GTCAAACAGA ATCAACAAAT GCACAAAATG TTATTAACAA TGGTGATGCG ACTGACCAAC 14040

AAATTGCCGC AGAAAAAACA AAAGTAGAAG AAAAATATAA TAGCTTAAAA CAAGCAATTG 14100 

CTGGATTAAC TCCAGACTTG GCACCATTAC AAACTGCAAA AACTCAGTTG CAAAATGATA 14160 

TTGATCAGCC AACGAGTACG ACTGGTATGA CAAGCGCATC TATTGCAGCA TTTAATGAAA 14220

AACTTTCAGC AGCTAGAACT AAAATTCAAG AAATTGATCG TGTATTAGCC TCACATCCAG 14280

ATGTTGCGAC AATACGTCAA AACGTGACAG CAGCGAATGC CGCTAAATCA GCACTTGATC 14340 

AAGCACGTAA TGGCTTAACA GTCGATAAAG CGCCTTTAGA AAATGCGAAA AATCAACTAC 14400

AACATAGTAT TGACACGCAA ACAAGTACAA CTGGTATGAC ACAAGACTCT ATAAATGCAT 14460

ACAATGCGAA GTTAACAGCT GCACGTAATA AGATTCAACA AATCAATCAA GTATTAGCAG 14520 

GTTCACCGAC TGTAGAACAA ATTAATACAA ATACGTCTAC AGCAAATCAA GCTAAATCTG 14580 

ATTTAGATCA TGCACGTCAA GCTTTAACAC CAGATAAAGC GCCGCTTCAA ACTGCGAAAA 14640 

CGCAATTAGA ACAAAGCATT AATCAACCAA CGGATACAAC AGGTATGACG ACCGCTTCGT 14700 

TAAATGCGTA CAACCAAAAA TTACAAGCAG CGCGTCAAAA GTTAACTGAA ATTAATCAAG 14760 

TGTTGAATGG CAACCCAACT GTCCAAAATA TCAATGATAA AGTGACAGAG GCAAACCAAG 14820

CTAAGGATCA ATTAAATACA GCACGTCAAG GTTTAACATT AGATAGAGAG CCAGCGTTAA 14880

CAACATTACA TGGTGCATCT AACTTAAACC AAGCACAACA AAATAATTTC ACGCAACAAA 14940

TTAATGCTGC TCAAAATcAT GctGCGCTTG AAACAATTAA GTCTAACATT ACGGCTTTAA 15000

ATACTGCGAT GACGAAATTA AAAGACAGTG TTGCGGATAA TAATACAATT AAATCAGATC 15060 
AAAATTACAC TGACGCAACA CCAGCTAATA AACAAGCGTA TGATAATGCA GTTAATGCGG 15120 
CTAAAGGTGT CATTGGAGAA ACGACTAATC CAACGATGGA TGTTAACACA GTGAACCAAA 15180 
AAGCAGCATC TGTTAAATCG ACGAAAGATG CTTTAGATGG TCAACAAAAC TTACAACGTG 15240 
CGAAAACAGA AGCAACAAAT GCGATTACGC ATGCAAGTGA TTTAAACCAA GCACAAAAGA 15300
ATGCATTAAC ACAACAAGTG AATAGTGcAC AAAACGTGCA AGCAGTAAAT GATATTAAAC 15360 
AAACGACTCA AAGCTTAAAT ACTGCTATGA CAGGTTTAAA ACGTGGCGTT GCTAATCATA 15420 
ACCAAGTCGT ACAAAGTGAT AATTATGTCA ACGCAGATAC TAATAAGAAA AATGATTACA 15480 
ACAATGCATA CAACCATGCG AATGACATTA TTAATGGTAA TGCACAACAT CCAGTTATAA 15540
CACCAAGTGA TGTTAACAAT GCTTTATCAA ATGTCACAAG TAAAGAACAT GCATTGAATG 15600
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ATTTAAATAA TGCACAACGT CAAAACTTAC AATCGCAAAT TAATGGTGCG CATCAAATTG 15720 

ATGCAGTTAA TACAATTAAG CAAAATGCAA CAAACTTGAA TAGTGCAATG GGTAACTTAA 15780 

GACAAGCTGT TGCAGATAAA GATCAAGTGA AACGTACAGA AGATTATGCG GATGCAGATA 15840 

CAGCTAAACA AAATGCATAT AACAGTGCAG TTTCAAGTGC CGAAACAATC ATTAATCAAA 15900 

CAACAAATCC AACGATGTCT GTTGATGATG TTAATCGTGC AACTTCAGCT GTTACTTCTA 15960 

ATAAAAATGC ATTAAATGGT TATGAAAAAT TAGCACAATC TAAAACAGAT GCTGCAAGAG 16020

CAATTGATGC ATTACCACAT TTAAATAATG CACAAAAAGC AGATGTTAAA TCTAAAATTA 16080

ATGCTGCATC AAATATTGCT GGCGTAAATA CTGTTAAACA ACAAGGTACA GATTTAAATA 16140

CAkCGATGGg TAACTTGCAA GGTGCAATCA ATGATGAACA AACGACGCTT AATAGTCAAA 16200 

ACTATCAAGA TGCGACACCT AGTAAGAAAA CAGCATACAC AAATGCGGTA CAAGCTGCGA 16260 

AAGATATTTT AAATAAATCA AATGGTCAAA ATAAAACGAA AGATCAAGTT ACTGAAGCGA 16320 

TGAATCAAGT GAATTCTGCT AAAAATAACT TAGATGGTAC GCGTTTATTA GATCAAGCGA 16380

nCAAaCAGCA AAACAGCAGT TAAATAATAT GACGCATTTA ACAACTGCAC AAAAAACGAA 16440

TTTAACAAAC CAAATTAATA GTGGTACTAC TGTCGCTGGT GTTCAAACGG TTCAATCAAA 16500

TGCCAATACA TTAGATCAAG CCATGAATAC GTTAAGACAA AGTATTGCCA ACAAAGATGC 16560 

GACTAAAGCA AGTGAAGATT ACGTAGATGC TAATAATGAT AAGCAAACAG CATATAACAA 16620

CGCAGTAGCT GCTGCTGAAA CGATTATTAA TGCTAATAGT AATCCAGAAA TGAATCCAAG 16680

TACGATTACA CAAAAAGCAG AGCAAGTGAA TAGTTCTAAA ACGGCACTTA ACGGTGATGA 16740 

AAACTTAGCT GCTGCAAAAC AAAATGCGAA AACGTACTTA AACACATTGA CAAGTATTAC 16800






AGATGCTCAA AAGAACAATT TGATTAGTCA AATTACTAGT GCGACAAGAG TGAGTGGTGT 16860 

TGAXACTGTA AAACAAAATG CGCAACATCT AGACCAAGCT ATGGCTAGCT TACAGAATGG 16920

TATTAACAAC GAATCTCAAG TGAAATCATC TGAGAAATAT CGTGATGCTG ATACAAATAA 16980

ACAACAAGAG TATGATAATG CTATTACTGC AGCGAAAGCG ATTTTAAATA AATCGACAGG 17040

TCCAAACACT GCGCAAAATG CAGTTGAAGC AGCATTACAA CGTGTTAATA ATGCGAAAGA 17100

TGCATTGAAT GGTGATGCAA AATTAATTGC AGCTCAAAAC GCAGCGAAAC AACATTTAGG 17160 

TACTTTAACG CATATCACTA CAGCTCAACG TAATGATTTA ACAAATCAAA TTTCACAAGC 17220 

TACAAACTTA GCTGGTGTTG AATCTGTTAA ACAAAATGCG AATAGTTTAG ATGGTGCTAT 17280 

GGGTAACTTA CAAACGGCTA TCAACGATAA GTCAGGAACA TTAGCGAGCC AAAACTTCTT 17340

GGATGCTGAT GAGCAAAAAC GTAATGCATA CAATCAAGCT GTATCAGCAG CCGAAACCAT 17400 
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TGTTAATAAT GCGAAACATG CATTAAATGG TACGCAAAAC TTAAACAATG CGAAACAAGC 17S20 

AGCGATTACA GCAATCAATG GCGCATCTGA TTTAAATCAA AAACAAAAAG ATGCATTAAA 17580

AGCACAAGCT AATGGTGCTC AACGCGTATC TAATGCACAA GATGTACAGC ACAATGCGAC 17640 

TGAACTGAAC ACGGCAATGG GCACATTAAA ACATGCCATC GCAGATAAGA CGAATACGTT 17700 

AGCAAGCAGT AAATATGTTA ATGCCGATAG CACTAAACAA AATGCTTACA CAACTAAAGT 17760 

TACCAATGCT GAACATATTA TTAGCGGTAC GCCAACGGTT GTTACGACAC CTTCAGAAGT 17820 

AACAGCTGCA GCTAATCAAG TAAACAGCGC GAAACAAGAA TTAAATGGTG ACGAAAGATT 17880

ACGTGAAGCA AAACAAAACG CCAATACTGC TATTGATGCA TTAACACAAT TAAATACACC 17940

TCAAAAAGCT AAATTAAAAG AACAAGTGGG ACAAGCCAAT AGATTAGAAG ACGTACAAAC 18000 

TGTTCAAACA AATGGACAAG CATTGAACAA TGCAATGAAA GGCTTAAGAG ATAGTATTGC 18060 

TAACGAAACA ACAGTCAAAA CAAGTCAAAA CTATACAGAC GCAAGTCCGA ATAACCAATC 18120 

AACATATAAT AGCGCTGTGT CAAATGCGAA AGGTATCATT AATCAAACTA ACAATCCGAC 18180 

TATGGATACT AGTGCGATTA CCCAAGCTAC AACACAAGTG AATAATGCTA AAAATGGTTT 18240

AAACGGTGCT GAAAACTTAA GAAATGCACA AAACACTGCT AAGCAAAACT TAAATACATT 18300

ATCACACTTA ACAAATAACC AAAAATCTGC CATCTCATCA CAAATTGATC GTGGAGGTCA 18360

TGTGAGTGAG GTAACTGCTA CTAAAAATGC AGCAACTGAG TTGAATACGC AAATGGGTAA 18420 

CTTGGAACAA GCTATCCATG ATCAAAACAC AGTTAAACAA AGTGTTAAAT TTACTGATGC 18480 

AGATAAAGCT AAACGTGATG CGTATACAAA TGCGGTAAGC AGAGCTGAAG CAATTCTGAA 18540 

TAAAACGCAA GGTGCAAATA CGTCTAAACA AGATGTTGAA GCGGCTATTC AAAATGTTTC 18600

AAGTGCTAAA AATGCATTGA ATGGTGATCA AAACGTTACA AATGCGAAGA ATGCAGCTAA 18660

AAATGCATTA AATAACTTAA CGTCAATTAA TAATGCACAA AAACGTGACT TAACAACTAA 18720

AATTGATCAA GCAACAACTG TAGCTGGTGT TGAAGCTGTA TCTAATACGA GTACACAATT 18780

GAAtACAGCG ATGGCTAACT TGCAAAATGG TATTAATGAT AAAACAAATA CACTAGCAAG 18840

TGAAAACTAT CATGATGCTG ATTCAGATAA GAAAACTGCT TATACTCAAG CCGTTACGAA 18900

CGCAGAAAAT ATTTTAAATA AAAATAGTGG ATCAAATTTA GACAAAACTG CCGTTGAAAA 18960

CGCGTTGTCA CAAGTTGCTA ATGCGAAAGG TGCCCTAAAT GGTAACCATA ATTTAGAGCA 19020

AGCTAAATCA AATGCAAACA CTACTATAAA CGGACTTCAA CATTTAACAA CTGCTCAAAA 19080

AGATAAATTG AAACAACAAG TGCAACAAGC ACAAAATGTT GCAGGTGTAG ATACTGTTAA 19140

ATCAAGTGCC AACACATTAA ATGGTGCTAT GGGTACGTTA AGAAATAGCA TACAAGATAA 19200
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TAACAATGCT GTTGATAGTG CTAATGGTGT CATTAATGCA ACAAGCAATC CAAATATGGA 19320 

TGCTAATGCA ATTAACCAAA TCGCTACACA AGTGACATCA ACGAAAAATG CATTAGATGG 19380

TACACATAAT TTAACGCAAG CGAAACAAAC AGCAACAAAT GCCATCGATG GTGCTACTAA 19440 

CTTAAATAAA GCGCAAAAAG ATGCGTTAAA AGCACAAGTT ACAAGTGCGC AACGTGTTGC 19500 

AAATGTAACA AGTATCCAAC AAACTGCAAA TGAACTTAAT ACAGCTATGG GTCAATTACA 19560 

ACATGGTATT GATGATGAAA ATGCAACAAA ACAAACTCAA AAATATCGTG ACGcTGAACA 19620

AAGTAAGAAA ACTGCTTATG ATCAAGCTGT AGCTGCTGCG AAAGCAATTT TAAATAAACA 19680

AACAGGTTCA AATTCAGATA AAGCAGCAGT TGACCGTGCA TTACAACAAG TAACAAGTAC 19740

GAAAGATGCA TTGAATGGTG ATGCAAAACT GGCAGAAGCG AAAGCGGCAG CTAAACAAAA 19800 

CTTAGGCACT TTAAACCATA TTACGAATGC ACAACGTACT GACTTAGAAG GCCAAATCAA 19860

TCAAGCGACG ACTGTTGATG GCGTTAATAC TGTAAAAACA AATGCCAATA CATTAGACGG 19920 

CGCAATGAAT AGCTTACAAG GTTCAATCAA TGATAAAGAT GCGACATTAA GAAATCAAAA 19980

TTATCTTGAT GCGGATGAAT CAAAACGAAA TGCATATACG CAAGCTGTCA CAGCGGCTGA 20040 

AGGCATTTTA AATAAACAAA CTGGTGGTAA CACATCTAAA GCAGACGTTG ATAATGCATT 20100

AAATGCAGTT ACAAGAGCGA AAGcGgCTTT AAATGGTGCT GACAACTTAA GAAATGCGAA 20160

AACTTCAGCA ACAAATACGA TTGATGGTTT ACCTAACTTA ACACAATTAC AAAAAGACAA 20220 

CTTGAAGCAT CAAGTTGAaC AAGCGCAAAA TGTAGCAGGT GTAAATGGTG TTAAAGATAA 20280

AGGTAATACG TTAAATACTG CCATGGGTGC ATTACGTACA AGTATCCAAA ATGATAATAC 20340 

GACGAAAACA AGTCAAAATT ATCTTGATGC ATCTGACAGC AACAAAAATA ATTACAATAC 20400 

TGCTGTAAAT AATGCAAATG GTGTTATTAAT TGCAACGAAC AATCCAAATAT TGGATGCTAA 20460

TGCGATTAAT GGCATGGCAA ATCAAGTCAA TACAACAAAA GCAGCGTTAA ATGGTGCACA 20520 

AAACTTAGCT CAAGCTAAAA CAAATGCGAC GAACACAATT AACAACGCAC ATGACTTAAA 20580 

CCAAAAACAA AAAGATGCAT TAAAAACACA AGTTAACAAT GCACAACGTG TATcTGATGC 20640 

AAATAACGTT CAACACACTG CAACTGAATT GAACAGTGCG ATGACAGCAC TTAAAGCAGC 20700 

TATTGCTGAT AAAGAAAGAA CAAAAGCAAG CGGTAATTAT GTCAATGCTG ATCAAGAAAA 20760 

ACGTCAAGCG TATGATTCAA AAGTGACTAA CGCTGAAAAT ATCATTAGTG GTACACCGAA 20820 

TGCGACATTA ACAGTCAATG ACGTAAATAG TGCGGCATCA CAAGTCAATG CGGCTAAAAC 20880

AGCATTAAAT GGTGATAACA ACTTACGTGT AGCGAAAGAG CATGCCAACA ATACAATTGA 20940

CGGCTTAGCA CAATTGAATA ATGCACAAAA AGCAAAATTA AAAGAACAAG TTCAAAGTGC 21000 
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GAAAGGCTTA AGAGATAGTA TTGCGAATGA 


> AGCAACAATT 


' AAAG CAGGTC 


' AAAACTACAC 


21120 




TGACGCAAGT CCAAATAATr 


vj 1 MlGAG T A 


CGACAGTGCA 


. GTTACTGCAG 


CAAAAGCAAT 


21180 




CATTAATCAA ACATCGAACC 


AALVjlA 1 wwA 


ACCAAATACT 


ATTACGCAAG 


TAACATCACA 


21240 




AGTGACAACT AAAGAACAGG 


PLTTfi 7A ATVO 
V-Al lnflMl\jO 


TGCGCGAAAC 


TTAGCTCAAG 


CTAAGACAAC 


21300 




TGCGAAAAAC AACTTGAATA 




AATTAACAAT 


GCACAAAAAG 


ATGCGTTAAC 


21360 


GCGTAg c ATT GATGGTGCAA 


*-~AAL_A w i Aw V- 


TGGTGTAAAT 


CAAGAAACTG 


CAAAAGCAAC 


21420 




AGAATTAAAT AACGCAATGC 


A T* A /**t h i"!' iv p iv 
A i ML> 111 ALA 


AAATGGTATC 


AATGATGAGA 


CACAAACAAA 


21480 




ACAAACTCAG AAATACCTAG 


A TViP R P R p 
A 1 wUALiAO C C 


AAGTAAGAAA 


TCAGCTTATG 


ATCAAGCAGT 


21540 


AAATGCAGCG AAAGCAATTT 


lAftLAAAAbL 


TAGTGGTCAA 


AATGTAGACA 


AAGCAG CAGT 


21600 




TGAACAAGCA TTPPAA a atp 


I wAACAGTAC 


GAAGACjGGCG 


TTGAACGGTG 


ATGCGAAATT 


21660 




AAATGAAGCT AAAGCAGCTH 


t-OAAALAAAC 


GTTAGGTACA 


TTAACACACA 


TTAATAATGC 


21720 




ACAACGTACA f5Cf?TTAnAra 


TV T , Z'** 7\ IV fv ^T—f * ■» 

A 1 w AAATTAC 


ACAAGCAACA 


AATGTTGAAG 


GTGTTAATAC 


21780 




AGTTAAAGPP AAArtPP.PAAr' 


AATTAGATGG 


TGCTATGGGT 


CAATTAGAAA 


CATCAATT CG 


21840 




TGAT AAAG A P A PP. A rnTT zv 


AAAGTCAAAA 


TTATCAAGAT 


GCTGATGATG 


CTAAACGAAC 


21900 




TGCTTATTCT CAACf AP.T A A 


A 1 bCAGCAG C 


AACTATTTTA 


AATAAAACAg CTGG CGGTAA 


21960 




T A CA C CT AAA PrPAP^ATYlTTr* 


TV TV A JV 5v iv m 

AAAtj AG CAAT 


GCAAGCTGTT 


ACACAAGCAA 


ATACTG cATT 


22020 




AAACGGTATT CAmAACTTAG 


Al \_w 1 wV-VjAA 


ACArGCTGCT 


AACACAGCGA 


TT ACAAATG C 


22080 




TTCGGACTTA AATACAAAAC 


nVAAAAvjAAvjL 


ATTAAAAgCA 


CAAGTAACAA 


GTG CAGGACG 


22140 




TGTATCTGCA GCAAATHHTr: 


I IvjAACATAC 


TGCGACTGAA 


TTAAATACTG 


CGATGACAGC 


22200 


TTTAAAGCGT G C CATTP, PTO 


A *T* ZV TA T\ Z*' PTV' TV. 


GACAAAAGCT 


AGTGGTAACT 


ATGTCAATGC 


22260 




TGATCTCGAAT AAACGT CAAG 


P A *T* A TP TA TP 7v 
LA 1 A 1 A 1 VjA 


AAAAGTTACA 


GCTGCCGAAA 


ATATCGTTAG 


22320 




TGGTACACCA APAPPA A PPT 


T> TV tv p TV r*pjvnr» 

1 AALACCAGC 


AGATGTTACA 


AATGCAGCAA 


CGCAAGTAAC 


22380 


GAATGCTAAG ACGPAPTTA A 


A f^nr^T* A TV TP TV 

ALvAj l AATCA 


TAATTTAGAA 


GTAGCGAAAC 


AAAATGCTAA 


22440 




CACTGCAATT GATfVSTTTA A 


CTTCTTTAAA 


TGGTCCGCAA 


AAAGCAAAAC 


TTAAAGAACA 


22500 




AGTGGGTCAA GCGACGACGT TGCCAAATGT TCAAACTGTT CGTGATAATG CACAAACATT 22560

AAACACTGCA ATGAAAGGTC TACGAGATAG CATTGCGAAT GAAGCAACGA TTAAAGCAGG 22620

TCAAAACTAC ACAGATGCAA GTCAAAACAA ACAAACTGAC TACAACAGTG CAGTCACTGC 22680

AGCAAAAGCA ATCATTGGTC AAACAACTAG TCCATCAATG AATGCGCAAG AAATTAATCA 22740

AGCGAAAGAC CAAGTGACAG CTAAACAACA AGCGTTAAAC GGTCAAGAAA ACTTAAGAAC 22800
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AGATGCAGTG AAACGTCAAA TCGAAGGTGC AACGCATGTT AATGAAGTAA CACAAGCACA 22 920 

AAATAATGCG GATGCaTTAA ATACAGCTAT GACGAACTTG AAAAATGGTA TTCAAGATCA 22980

GAATACGATT AAGCAAGGTG TTAACTTCAC TGATGCCGAC GAAGCGAAAC GTAATGCATA 23040

TACAAATGCA GTGACGCAAG CTGAACAAAT TTTAAATAAA GCACAAGGTC CAAATACTTC 23100 

AAAAGACGGT GTCGAAACTG CGTTAGAaAA TGTACAACGT GCTAAAAACG AATTGAACGG 23160 

TAATCAAAAT GTTGCGAACG CTAAGACAAC TGCGAAAAAT GCATTGAATA ACCTAACATC 23220

AATTAATAAT GCACAAAAAG AAGCATTGAA ATCACAAATT GAAGGTGCGA CAACAGTTGC 23280 

AGGTGTAAAT CAAGTGTCTA CAACGGCATC TGAATTAAAT ACAGCAATGA GCAACTTACA 23340 

AAATGGTATT AATGATGAAG CAGCTACAAA AGCAGCGCTT AATGGTACTC AAAACCTTGA 23400

AAAAGCTAAA CAACACGCAA ATACAGCAAT TGACGGTTTA AGCCATTTAA CAAATGCACA 23460 

AAAAGAGGCA TTAAAACAAT TGGTACAACA ATCGACTACT GTTGCAGAAG CACAAGGTAA 23520

TGAGCAAAAA GCAAACAATG TTGATGCAGC AATGGACAAA TTACGTCAAA GTATTGCAGA 23580

TAATGCGACA ACAAAACAAA ACCAAAATTA TACTGATGCA AGTCAGAATA AAAAGGATGC 23640

GTACAATAAT GCTGTCACAA CTGCACAAGG TATTATTGAT CAAACTACAA GTCCAACTTT 23700 

AGATCCGACT GTTATCAATC AAGCTGCTGG ACAAGTAAGC ACAACTAAAA ATGCATTAAA 23760 

TGGTAATGAA AACCTAGAGG CAGCGAAACA ACAAGCGTCA CAATCATTAG GTTCATTAGA 23820 

TAACTTAAAT AATGCGCAAA AACAAACAGT TACTGATCAA ATTAATGGCG CGCATACTGT 23880

TGATGAAGCA AATCAAATTA AGCAAAATGC GCAAAACTTA AATACAGCGA TGGGTAACTT 23940

GAAACAAGCG ATAGcTGACA AAGATGCTAC GAAAGCGACA GTTAACTTCA CTGATGCAGA 24000

TCAAGCAAAA CAACAAGCAT ATAACaCTGC TGTTACAAAT GCTGAAAATA TCATTTCAAA 24060

AGCTAATGGC GGCAATGCAA CACAAGCTGA AGTTGAACAA GCAATCAAAC AAGTTAATGC 24120 

TGCAAAACAA GCATTAAATG GTAATGCCAA CGTTCAACAT GCAAAAGACG AAGCAACAGC 24180

ATTAATTAAT AGCTCTAATG ACCTTAACCA AGCACAAAAA GACGCATTAA AACAACAAGT 24240 

TCAAAATGCA ACTACTGTAG CTGGTGTAAA CAATGTTAAA CAAACAGCAC AAGAGTTAAA 24300

CAATGCTATG ACACAATTAA AACAAGGCAT TGCAGATAAA GAACAAACAA AAGCTGATGG 24360 

TAACTTTGTC AATGCAGATC CTGATAAGCA AAATGCATAT AATCAAGCAG TAGCGAAAGC 24420

TGAAGCATTA ATTAGTGctA CGCCTGATGT TGTCGTTACA CCTAGCGAAA TTACTGCAGC 24480

GTTAAATAAA GTTACGCAAG CTAAAAATGA TTTAAATGGT AATACAAACT TAGCAACGGC 24540 

GAAACAAAAT GTTCAACATG CTATTGATCA ATTGCCAAAC TTAAACCAAG CGCAACGTGA 24600 
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AGCGGCGACA ACGCTTAATG ACGCGATGAC ACAATTGAAA CAAGGTATTG CGAATAAAGC 
ACAAATTAAA GGTAGCGAGA ACTATCACGA TGCTGATACT GACAAGCAAA CAGCATATGA
TAATGCAGTA ACAAAAGCAG AAGAATTGTT AAAACAAACA ACAAATCCAA CAATGGATCC 
AAATACAATT CAACAAGCAT TAACTAAAGT GAATGACACA AATCAAGCAC TTAACGGTAA 
TCAAAAATTA GCTGATGCCA AACAAGATGC TAAGACAACA CTTGGTACAC TAGATCATTT 
AAATGATGCT CAAAAACAAG CGCTAACAAC TCAAGTTGAA CAAGCACCAG ATATTGCAAC
AGTTAATAAT GTTAAGCAAA ATGCTCAAAA TCTGAATAAT GCTATGACTA ACTTAAACAA 
TGCATTACAA GATAAAACTG AGACATTAAA TAGCATTAAC TTTACTGATG CAGATCAAGC 
TAAGAAAGAT GCTTATACTA ATGCGGTTTC ACATGCAGAA GGTATTTTAT CTAAAGCAAA 
TGGCAGCAAT GCAAGTCAAA CTGAAGTGGA ACAAGCGATG CAACGTGTGA ACGAAGCGAA 
ACAAGCATTG AATGGTAATG ACAATGTACA ACGTGCAAAA GATGCAGCGA AACAAGTGAT 



TACAAATGCA AATGATTTAA ATCAAGCAAT GACACAATTG AAACAAGGTA TTGCAGATAA 
AGACCAAACT AAAGCAAATG GTAACTTTGT CAATGCTGAT ACTGATAAGC AAAATGCTTA 
CAACAATGCG GTAGCACATG CTGAACAAAT AATTAGTGGT ACACCAAATG CAAACGTGGA

TCCACAACAA GTGGCTCAAG CGTTACAACA AGTGAATCaA GCTAAGGGTG ATTTAAACGG 
TAACCATAAC TTACAAGTTG CTAAAGACAA TGCAAATACA GCCATTGATC AGTTACCAAA
CTTAAATCAA CCACAAAAAA CAGCATTAAA AGACCAAGTG TCGCATGCAG AACTTGTTAC
AGGTGTTAAT GCTATTAAGC AAAATGCTGA TGCGTTAAAT AATGcAATGG GTACATTGAA
ACAACAAATT CAAGCGAACA GTCAAGTACC ACAGTCAGTT GACTTTACAC AAGCGGATCA 
AGACAAACAA CAAGCATATA ACAATGCGGC TAACCAAGCG CAACAAATCG CAAATGGCAT 
ACCAACACCT GTATTGACGC CTGATACAGT AACACAAGCA GTGACAACTA TGAATCAAGC 
GAAAGATGCA TTAAACGGTG ATGAAAAATT AGCACAAGCG AAACAAGAAG CTTTAGCAAA 
TCTTGATACG TTACGCGATT TAAATCAACC ACAACGTGAT GCATTACGTA ACCAAATCAA 
TCAAGCACAA GCGTTAGCTA CAGTTGAACA AACTAAACAA AATGCACAAA ATGTGAATAC 
aGCaATGAGT AACTTGAAAC aAGGTATTGC aAACAAAGAT ACTGTCAAAG CAAGTGAGAA 26160 
CTATCATGAT GCTGATGCCG ATAAGCAAAC AGCATATACA AATGCAGTGT CTCAAGCGGA 
AGGTATTATC AATCAAACGA CAAATCCAAC GCTTAACCCA GATGAAATAA CACGTGCATT 
AACTCAAGTG ACTGATGCTA AAAATGGCTT AAACGGTGAA GCTAAATTGG CAACTGAAAA
GCAAAATGCT AAAGATGCCG TAAGTGGGAT GAOGCATTTA AACGATGCTC AAAAACAAGC 



25380 
25440 
25500 
25560 
25620 
25680 
25740 
25800 
25860 
25920 
25980 
26040 
26100 



26220 
26280 
26340 
26400 
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10 






AGCAACGAGC CTAGATCAAG CAATGGATCA ATTATCACAA GCTATTAATG ATAAAGCTCA 26520

AACATTAGCG GACGGTAATT ACTTAAATGC AGATCCTGAC AAACAAAATG CGTATAAACA 26580 

GGCAGTAGCA AAAGCTGAAG CATTATTGAA TAAACAAAGT GGTACTAATG AAGTACAAGC 26640 

ACAAGTTGAA AGCATCACTA ATGAAGTGAA CGCAGCGAAA CAAGCATTAA ATGGTAATGA 26700 

CAATTTGGCA AATGCAAAAC AACAAGCAAA ACAACAATTG GCGAACTTAA CACACTTAAA 26760 

TGATGCACAA AAACAATCAT TTGAAAGTCA AATTACACAA GCGCCACTTG TTACAGATGT 26820

CACTACGATT AATCAAAAAG CACAAACGTT AGATCATGCG ATGGAATTAT TAAGAAATAG 26880

TGTTGCGGAT AATCAAACGA CATTAGCGTC TGAAGATTAT CATGATGCAA CTGCGCAAAG 26940

ACAAAATGAC TATAACCAAG CTGTAACAGC TGCTAATAAT ATAATTAATC AAACTACATC 27000 

GCCTACGATG AATCCAGATG ATGTTAATGG TGCAACGACA CAAGTGAATA ATACGAAAGT 27060

TGCATTAGAT GGTGATGAAA ACCTTGCAGC AGCTAAACAA CAAGCAAACA ACAGACTTGA 27120 

TCAATTAGAT CATTTGAATA ATGCGCAAAA GCAACAGTTA CAATCACAAA TTACGCAATC 27180

ATCTGATATT GCTGCAGTTA ATGGTCACAA ACAAACAGCA GAATCTTTAA ATACTGCGAT 27240

GGGTAACTTA ATTAATGCGA TTGCAGATCA TCAAGCCGTT GAACAACGTG GTAACTTCAT 27300

CAATGCTGAT ACTGATAAAC AAACTGCTTA TAATACAGCG GTAAATGAAG CAGCAGCAAT 27360 

GATTAACAAA CAAACTGGTC AAAATGCGAA CCAAACAGAA GTAGAACAAG CTATTACTAA 27420 

AGTTCAAACA ACACTTCAAG CGTTAAATGG AGACCATAAT TTACAAGTTG CTAAAACAAA 27480

TGCGACGCAA GCAATTGATG CTTTAACAAG CTTAAATGAT CCTCAAAAAA CAGCATTAAA 27540 

AGACCAAGTT ACAGCTGCAA CTTTAGTAAC TGCAGTTCAT CAAATTGAAC AAAATGCGAA 27600 



TACGCTTAAC CAAGCAATGC ATGGTTTAAG ACAGAGCATT CAAGATAACG CAGCAACTAA 27660 

AGCAAATAGC AAATATATCA ACGAAGATCA ACCAGAGCAA CAAAACTATG ATCAAGCTGT 27720

TCAAGCCGCA AATAATATTA TCAATGAACA AACTGCAACA TTAGATAATA ATGCGATTAA 27780 

TCAAGCAGCG ACAACTGTGA ATACAACGAA AGCAGCATTA CATGGTGATG TGAAGTTACA 27840

AAATGATAAA GATCATGCTA AGCAAACGGT TAGTCAATTA GCACATCTAA ACAATGCACA 27900 

AAAACATATG GAAGATACGT TAATTGATAG TGAAACAACT AGAACAGCAG TTAAGCAAGA 27960 

TTTGACTGAA GCACAAGCAT TAGATCAACT TATGGATGCA TTACAACAAA GTATTGCTGA 28020

CAAAGATGCA ACACGTGCGA GCAGTGCATA TGTCAATGCA GAACCGAATA AAAAACAATC 28080

CTATGATGAA GCAGTTCAAA ATGCTGAGTC TATCATTGCA GGATTAAATA ATCCAACTAT 28140 

CAATAAAGGT AATGTATCAA GTGCGACTCA AGCAGTAATA TCATCTAAAA ATGCATTAGA 28200
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TCAATTAACA 


CCAGCTCAAC AACAAGCGCT AGAAAATCAA ATTAATAATG CAACAACTCG 28320

TGATAAAGTG GCTGAAATCA TTGCACAAGC GCAAgCATtA AATGAAGCGA TGAAAGCATT 28380

AAAAGAAAGT ATTAAGGATC AACCACAAAC TGAAGCAAGT AGTAAATTTA TTAACGAGGA 28440

TCAAGCGCAA AAAGATGCTT ATACGCAAGC AGTACAACAC GCGAAAGATT TGATTAACAA 28500




AACAACTGAT 




CTAAATCAAT 


CATTGATCAA 


G CG ACACAGG 


CAGTGACAGA 


28560 


TGCTAAAAAC 


R ATTTfiPa TV** 


GTGATCAAAA 


ACT AG CT CAA 


GATAAGCAAC 


GTGCAACAGA 


28620 




AACG TTAAAT 


a a r^i w rr t ^t* a 

^VrtL. X HalUi A 


ACTTGAATAC 


ACCACAACGT 


CAAGCACTTG 


AAAATCAAAT 


28680 




TAATAATGPA 


czc* a a r*rppTv 


GCGAAGTAGC 


ACAAAAATTA 


ACTGAAGCAC 


AAGCACTTAA 


28740 






(jAACjC JTl'AC 


GTAATAGCAT 


TCAAGATCAA 


CAGCAAACGG 


AAGCGGGTAG 


28800




f* B IV f"2*P*T ,r P ATf 
wviy 1. x X t\ 1 \— 


AA IvjAAG ATA 


AaCCaCmAAA 


AG rTGCTTAC 


CAAGCAGCAG 


TTCAAAATGC 


28860 






A I IAATCAAA 


CTAACAATCC 


AACG CTTGAT 


AAAGCACAAG 


TTGAACAATT 


28920 






<j 1 1 AAC CAAG 


CTAAAGATAA 


CCTACACGGT 


GATCAAAAAC 


TTGCAGACGA 


28980 




T A A A ("^ A. ApuT 


vjL.Cj<j I X ACTG 


ATTTAAATCA 


ATTAAATGGT 


TTGAATAATC 


CGCAACGTCA 


29040 




AGCAPTTfiAA 


A CC^ A A A A A 

nOLLAAAIAA 


ACAACGCAGC 


AACTCGTGG C 


GAAGTAGCAC 


AAAAATTAGC 


29100 




*IY5 a Af^r* a & & A 


uUtat- 1 xGATC 


AAGCAATG CA 


AG CATTACGT 


AATAGTATTC 


AAGATCAACA 


29160 




— r\j-\j-\.v^/\yjj-> ) j~\ 


1 iob 1 AOCA 


AGTTTATCAA 


TGAAGATAAA 


CCGCAAAAAG 


ATG CTTACCA 


29220 




«U ^>/l\j WlVj X X 




AAGATTTAAT 


TAACCAAACA 


GGTAATCCAA 


CACTCGACAA 


29280 




AT CACAAGT A 


A A^a a ttt a 


CACAAGCAGT 


AACAACTGCA 


AAAGATAATC 


TACATGGTGA 


29340 




TCAAAAA <"T 1 ' 


ijLlLuHjAlt. 


AACAACAAGC 


AGTAACAACT 


GT AAATG CAT 


TGCCAAACTT 


29400 


AAATCATGCA 




CATTAACTGA 


TGCTATAAAT 


GCAGCGCCTA 


CAAGAACAGA 


29460 




vjvj J. ivjwiv^u\ 


/■"A TV^T , *P/^' A A a 


CTGCTACTGA 


ACTTGATCAC 


GCGATGGAAA 


CATTGAAAAA 


29520 




TAAAGTTGAT CAAGTGAATA CAGATAAGGC TCAACCAAAT TACACTGAAG CGTCAACTGA 29580

TAAAAAAGAA GCAGTAGATC AAGCGTTACA AGCTGCAGAA AGCATTACAG ATCCAACTAA 29640

TGGTTCAAAT GCGAATAAAG ACGCTGTAGA CCAAGTATTA ACTAAGCTTC AAGAAAAAGA 29700

AAATGAGTTA AATGGTAATG AGAGAGTCGC TGAAGCTAAA ACACAAGCGA AACAAACTAT 29760

TGACCAATTA ACACATTTAA ATGCTGATCA AATTGCAACT GCTAAACAAA ACATTGATCA 29820

AGCGACGAAA CTTCAACCAA TTGCTGAATT AGTAGATCAA GCAACGCAAT TGAATCAATC 29880

TATGGATCAA TTACAACAAG CAGTTAATGA ACATGCTAAC GTTGAGCAAA CTGTAGATTA 29940

CACACAAGCA GATTCAGATA AACAAAATGC TTATAAACAA GCTATTGCTG ATGCTGAAAA 30000
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TGCAAAACAA 


GCATTAAATG 


GTGATGAACG 


TGTAGCACTT 


GCTAAAACAA 


ATGGTAAAC-A 


30120




TGACATCGAC 


CAATTGAATG 


CATTAAACAA 


TGCTCAACAA 


GATGGATTTA 


AACKji V-CjCJA 1 


30180 




CGATCAATCA 


AACGATTTAA 


AT CAAATC CA 


ACAAATTGTA 


GATGAGGCTA 


TV /"*/""■ f TV PTT TV 7A 


30240




TCGTGCAATG 


GATCAATTGT 


CACAAGAAAT 


CACTGACAAT 


GAAGGACGCA 


CGAAACjO I AU 


1 A *5 ft A 




CACGAACTAT 


GTCAATGCAG 


ATACACAAGT 


CAAACAAGTA 


TATGATGAAA 


LajCj i J/CiAlAA 


rt "S C C\ 

J 03 6 0 




AGCGAAACAA 


GCACTTGATA 


AATCGACTGG 


TCAAAACTTA 


ACTGCAAAAC 


TV » pmrp* T/"« 7V TV 

AALj I i A 1 LAA 






ATTAAATGAT 


GCAGTCACTG 


CAGCTAAGAA 


AGCATTAAAT 


GGTGAAGAAA 


/— t » A^nwit TV TV TV TV 

G AC l 1 AA I AA 


30480




TCGTAAAGCT 


GAAGCATTAC 


AAAGATTGGA 


TCAATTAACA 


CAT CTAAACA 


TV rr*f* /"•n"*/"* TV TV TV /"» 

ATGu I LAAAb 




ACAATTAGCA 


ATCCAACAAA 


TTAATAATGC 


TGAAACGCTA 


■» 4k m 4k 4k 4k /"4^| 4k in 

AAT AAAG CAT 


/ KH/^n TV /** TV TV 

CTCG AG CAAT 


30600




TAATAGAGCA ACTAAATTAG ATAATGCAAT GGGTTCAGTA CAACAATATA TTGACGAACA 30660

GCACCTTGGT GTTATCAGCA GCACAAATTA CATCAATGCA GATGACAATT TGAAAGCAAA 30720

TTATGATAAT GCAATTGCGA ATGCAGCACA TGAGTTAGAT AAAGTGCAAG GTAATGCAAT 30780

TGCaAAAGCT GAAGCAGAGC AATTGAAACA AAATATTATC GATGCTCAAA ATGCATTAAA 30840

TGGAGACCAA AACCTTGCAA ATGCCAAAGA TAAAGCAAAT GCGTTTGTTA ATTCGTTAAA 30900




TGGATTAAAT 


CAACAGCAAC 


AAGATCTTGC 


ACATAAAGCA 


TV « I T TV TV ^ TV 7\ *T*/~* 


/TV* t\t»7a /" ,r Pf" ,r P 
CCViA I Av_ 1 vj 1 


jU7 Dv 




ATCAGATGTA ACAGATATTG TTAATAATCA AATTGACTTA AATGATGCAA TGGAAACATT 31020

GAAACATTTA GTTGACAATG AAATTCCAAA TGCAGAGCAA ACTGTCAATT ACCAAAACGC 31080

TGACGATAAT GCTAAA 31096




(2) INFORMATION FOR SEQ ID NO: 60: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2243 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 






ATGACAGAAT 


GGGAGCGAGG 


ACTTAGAATG 


TTTCCTAAAT 


CAGGTTTATT 


AAATTTTGAG 


60 


TTAGCGATAG 


mAAATCGTTC 


ATTAAATGAT 


GATGAAAAAG 


CATTAAAATA 


TGTGCGTAAA 


120 


GCATTAAATG 


CAGACCCTAA 


AAATACAGAT 


TATATTAACT 


TAGAAAAAGA 


GTTGACTAAA 


180 


TCAAATGAGT 


CGAAAAATAA 


ATAACTTTTA 


TGATGTACAA 


CAGTTATTGA 


AAAGTTACGG 


240 


ATTTCTAATA 


TATTTTAAAA 


ATCCAGAAGA 


TATGTACGAA 


ATGATTCAAC 


AGGAGATTTC 


300 
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TAATCAGAGA AGGAATGAAC AGAAATGACA 
ACGACTTGTA AATTAGGTAT TTTCACACCT
CACACTGATA CATCTGATAG TACAGGATAT
GTTGAAAAAG TAAATGAAAA TAATTATAAT
GTACCAGGTC CTGTTGACTT TGAAAAAGGT
CCAGAAAAAG TTAATGTACG TGAGATTTTT
GATAATGATG CTAACATAGC TGCTTTAGGG
GATGATGTTG TTGCCATCAC ACTTGGTACA
TGAAATCGTA CATGGTCATA ATGGCTCtGG
CgATCAACGA TTTaAATGTA ATTGTGGTCG
GACAGGCGTT GTTAACTTAG TTAACTTCtA
ATTAGAATTG ATTAAAGAAA ATAAGGTtAC
TGGTGACCAA TTCTGTATTT TCATTACTGA
TAGTATTATT AGTGTTACAA GTAATCCGAA
TGCAGGACCT ATTTTAATTG AAAATATTAA
TGCTCAATTT GAAACTGAAA TTGTACAAGC
AGCAGCAGGA TTAATCAAGA CCTATGTATT
GTTGATGTGG TTGTTATTCC AGTTGGAACG
GATATTCAGA AAAAACTTCA AGAATATAAA
CCAATGAATA CTCTAATTGA AGGTGAATTA
CATGAATTAC CTTTTGATAA AGGTTTAAGT
CGACGAGACA AATCTAGAAA AATGAATGAT
AATAGTGGTG AAAACCTATG AGGATTTCAA
CGTATTTCAT CGAAAATGAC AAAGCTGTTA
AAATTATTAA AAAATTAAAC CAAATAAATA
CACACTTTGA TCATATCGGA GCAGTCGATG
ATATGCATGA AGCAGAGTTT GATTTTCTAA
TTAAGCAATA TGGATTACCA ATTATTACAA
GTAGCACAGA AATAGAAGGA TTTAAGTTnT
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AAAATTATTT TAGCAGCTGA TGTAGGCGGG 420 

GAATTAGAAC AATTACATAA ATGGTCTATT 480

ACACTTTTGA AAGGAATTTA TGATTCGTTT 540

TTTTCAAATG TACTTGGCGT AGGTATTGGT 600

ACAGTAAATG GAGCAGTAAA CTTATATTGG 660

GAACAATTCG TTGATTGTCC AGTGTATGTA 720

GaGAAACACA AAGGTGCTGG TGAAGGTGCC 780

GGTCTAGGTG GAGGAATTAT TTCCAAATGG 840

CGCAGAAATA GGTCATTTTA GAgCAGACTT 900

TTCTGGATGT ATTGAAACAG TTGCTTCaGC 960

CTATCCGAAG TTGACGTTTA GATCTTCTAT 1020

aGCAAAAGCT GTTTTTGATG CGGCAAAAGC 1080

AAAGGTTGCA AACTATATTG GATATTTATG 1140

ATATATCGTT CTAGGTGGAG GAATGTCTAC 1200

AACAGAATAT CATAATTTAA CATTTGCACC 1260

GAAATTAGGT AATGATGCAG GTATTACAGG 1320

AGATAAAGAG GGGGTAAAAT AATGGCTATT 1380

GAAGGTCCGA GTGTTAGTAA ATATATTGCA 1440

GCAATGGGTA AAATTGATTT TCAATTAACA 1500

AGCGATGTAT TAGAAGTTGT GCAAGTGATA 1560

AGAGTTTGTA CAAATATCCG TATTGATGAC 1620

AAACTAACAT CAGTACAAAA ACATTTAGAA 1680

GCTTAACTTT AGGCTTAGTT GATACTAATA 1740

TTCTGATTGA CCCTTCAGGT GAAAGTGAAA 1800

AACCGTTAAA AGCTATTTTA TTAACACATG 1860

ATATAGTTGA TCGATTCGAT GTCCCGGTTT 1920

AAGATCCCGT TAAAAATGGG GCAGATAAAT 1980

GTAAGGTAAC TCCTGAAAAG TTAAmCGAAG 2040

nAyrTGTaCA CACACCTGGA CATTCACCAG 2100 
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GAATCGGACG TACAGATTTA TATAAAGGTG ATTATGAAAC GCTAGTTGAT TCTATTCAAG 2220 

ATAAAATATT TGAATTAGAA GGC 2243
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8009 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

TTGGnATCAT tyAcgGTAAA AAGAATAAaG CAAGATTtAT TTCATTAGTA CTAATTTGTG 60

CAATGTTTGC AATTTGTTGG GTTGCATATA TTCAATGGGA GTCTACAATC GCTTCATTTA 120

CACAATCTAT TAATATTTCa ATGGCACAAT ATAGTGTTTT ATGGACAATT AACGGAATAA 180

TGATTTTAGT AGCACAACCA TTAATTAAAC CGATTCTCTA TCTGTTAAAA GGAAACTTAA 240

AGAAGCAAAT GTTTGTCGGC ATCATCATTT TTATGTTGTC GTTCTTTGTC ACGAGTTTTG 300

CCGAAAACTT TACAATATTT GTTGTCGGTA TGATTATTTT AACTTTTGGA GAAATGTTTG 360

TATGGCCAGC AGTTCCAACT ATAGCCAATC AGTTAGCGCC AGATGGTAAG CAAGGACAGT 420

ACCAAGGTTT TGTGAATTCA GCTGCTACAG TAGGAAAAGC ATTTGGTCCA TTTCTTGGTG 480

GTGTATTAGT TGATGCGTTT AATATGCGCA TGATGTTTAT CGGTATGATG CTACTACTTG 540

TATTTGCATT AATATTATTA ATGGTTTTCA AGGAGAATAA TACGCAACCT AAAAAAATAG 600

ATGCATAATG AGTAAATAGA ATTAACGTTA TAGACTTGAA ATAAATGTCG TTATAACATA 660

ATATTAATTT GTATAATTTA ATTTCGTTTG GAGCTTTTCT ACAGAAAGCT AGTGATGCTG 720

AGAGCTAGTG TTAAGGACTA AATGTAAATC GTATTAATTT TAAATTGAAT GAATGACATC 780

TCTTACTATT AAAATGAGTG CACAATTTTT GTGAAATAGG GTGGTAACGC GGCAAATGTC 840

GTCCCTATGT AAATAGAATA GTTAGAGGTG TCTTTTTTAT TGAATAGGAG GAAATGTGTT 900

GAATTACAAC CACAATCAAA TTGAAAAGAA ATGGcAAGAC TATTGGGACG AAAATAAAAC 960

ATTTAAAACA AATGATAACT TAGGTCAAAA GAAATTTTAT GCTTTAGACA TGTTTCCATA 1020

TCCATCAGGT GCTGGTTTAC ATGTTGGACA TCCTGAGGGc TATACAGCAA CAGATATCAT 1080

TTCAAGATAT AAAAGAATGC AAGGATATAA TGTATTACAT CCGATGGGGT GGGATGCATT 1140

CGGATTACCA GCAGAGCAAT ATGCTTTAGA CACTGGCAAC GACCCACGTG AATTTACAAA 1200

GAAAAATATC CAAACTTTTA AACGACAAAT TAAAGAATTA GGGTTCAGTT ATGATTGGGA 1260
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GTTATATAAC 


AAAGGTTTAG 


CATACGTTGA 


TGAAGTTGCA 


GTTAACTGGT 


GTCCAGCATT 


1380 




AGGCACTGTT 


TTATCTAACG 


AAGAAGTGAT 


TGATGGTGTC 


TCTGAACGTG 


GTGGACATCC 


1440 




AGTTTATCGT 


AAGCCGATGA


AACAATGGGT 


ACTTAAAATC 


ACAGAATATG 


CAGATCAATT 


1500 




ATTAGCAGAT 


TTAGATGATT 


TAGATTGGCC 


TGAGTCTTTA 


AAAGATATGC 


AGCGCAATTG


1560 




GATTGGACGT 


TCTGAAGGGG 


CCAAAGTTTC 


ATTTGATGTA 


GATAATACGG 


AAGGAAAAGT 


1620 


AGAAGTATTT 


ACGACTAGAC 


CAGATACAAT 


CTATGGTGCA 


TCATTCTTAG 


TCTTAAGTCC 


1680 




TGAACATGCA 


TTAGTTAATT 


CAATTACAAC 


AGATGAATAT 


AAAGAAAAAG 


TAAAAGCTTA


1740 




TCAAACAGAA 


GCTTCTAAAA 


AGTCAGATTT


AGAACGTACA 


GATTTAGCAA 


AAGATAAATC 


1800 




AGGTGTATTT 


ACTGGTGCAT 


ATGCAACTAA 


TCCTTTATCT 


GGTGAAAAAG 


TACAAATTTG 


1860 




GATTGCTGAT 


TATGTATTAT 


CAACATATGG 


TACTGGAGCA 


ATTATGGCAG 


TACCAGCGCA 


1920 




TGATGAGAGA 


GATTATGAAT 


TTGCTAAAAA 


GTTTGATTTG 


CCAATCATTG 


AAGTCATCGA 


1980 




AGGTGGAAAT 


GTTGAAGAAG 


CAGCATACAC 


TGGTGAAGGT 


AAACATATTA 


ATTCTGGTGA 


2040 




ACTTGATGGT 


TTAGAAAATG 


AAGCGGCAAT 


TACTAAAGCT 


ATTCAATTAT 


TAGAGCAAAA 


2100 




AGGTGCTGGC 


GAAAAGAAAG 


TTAATTACAA 


ATTAAGAGAT 


TGGTTATTCA 


GTCGTCAGCG 


2160 




TTATTGGGGC 


GAACCAATTC 


CTGTCATTCA 


TTGGGAAGAT 


GGAACAATGA 


CAACTGTTCC 


2220 




TGAAGAAGAG 


CTACCATTGT 


TGTTACCTGA 


AACAGATGAA 


ATCAAGCCAT 


CAGGGACTGG 


2280 




TGAGTCTCCA 


CTAGCTAATA 


TTGATTCATT 


TGTAAATGTT 


GTAGATGAAA 


AAACAGGTAT 


2340 




GAAAGGACGT 


CGTGAAACAA 


ATACAATGCC 


ACAATGGGCA 


GGTAGTTGTT


GGTATTATTT 


2400 




ACGTTACATC 


GATCCTAAAA 


ATGAAAATAT 


GTTAGCAGAT 


CCTGAAAAAT 


TAAAACATTG 


2460 


GTTACCTGTT 


GATTTATATA 


TCGGTGGAGT 


AGAACATGCG 


GTTCTTCACT 


TATTATATGC 


2520 




AAGATTTTGG 


CATAAAGTCC 


TTTATGATTT 


GGCTATCGTA 


CCTACTAAAG 


AACCTTTCCA 


2580 




AAAATTATTT 


AACCAAGGTA 


TGATTTTAGG 


AGAAGGTAAT 


GAGAAGATGA 


GTAAATCTAA 


2640 




AGGAAATGTA 


ATCAATCCTG 


ATGATATAGT 


ACAGTCTCAT 


GGTGCAGATA 


CTTTGCGTCT 


2700 




TTACGAAATG 


TTTATGGGAC 


CTTTAGATGC 


TGCAATTGCA 


TGGAGTGAAA 


AAGGATTAGA 


2760 




TGGGTCTCGT 


CGATTCTTAG 


ATCGCGTATG 


GCGTTTAATG 


GTAAATGAAG 


ATGGGACATT 


2820 




GAGTTCAAAA 


ATTGTAACTA 


CAAATAATAA 


ATCTTTAGAT 


AAAGTTTATA


ACCAAACTGT 


2880 




TAAAAAGGTA 


ACAGAAGACT 


TTGAAACATT 


AGGATTTAAT 


ACTGCTATTA 


GTCAATTAAT 


2940 




GGTATTTATT 


AATGAGTGTT 


ATAAAGTTGA 


TGAAGTTTAT 


AAACCTTACA 


TTGAAGGCTT 


3000 




CGTTAAAATG 


TTAGCACCTA 


TTGCACCACA 


TATCGGTGAA 


GAATTATGGT 


CAAAATTAGG 


3060 
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TGATGAAGTA GAAATCGTTG TTCAAGTGAA TGGTAAATTG AGAGCTAAAA TTAAAATTGC 3180 

TAAAGATACA TCAAAAGAAG AAATGCAAGA AATTGCCTTA TCTAATGACA ATGTTAAAGC 3240

GAGTATTGAA GGTAAAGACA TCATGAAAGT CATCGCTGTT CCTCAAAAAT TAGTCAATAT 3300

TGTAGCTAAA TAATGTTTTA AGGAGGACTT TGAAATGAAG TCAATTACTA CAGATGAATT 3360

AAAAAATAAA CTTTTAGAAT CTAAACCAGT TCAAATTGTT GATGTTCGTA CTGATGAAGA 3420

AACAGCAATG GGATATATTC CTAATGCAAA GTTAATTCCA ATGGATACCA TTCCGGATAA 3480

TTTAAATTCA TTTAATAAAA ATGAAATATA TTATATTGTA TGTGCTGGTG GAGTTCGAAG 3540

CGCTAAAGTT GTAGAATATT TAGAGGCAAA TGGCATTGAT GCCGTAAATG TCGAAGGCGG 3600

CATGCACGCA TGGGGCGATG AAGGTTTGGA AATAAAAAGT ATTTAAAGTA GTGACATAAT 3660

TTAAAATAAT ATTACATTTG TAATGACACC AAGTAACGTT TCGGTTGCTT GGTGTTTTTT 3720

GGTATGAATT ACTTTCTGTT ACAAAACAAT CTAAAGCGTT CTTGTTATGT TTTATTAAGA 3780

TTTTAATTAC AAAACGGAAA CTAAATTGTA ATAAAATAAA ACTTTATTTT ATAAAATGAT 3840

GATGATAAAA TTGAGTGAAC TTAAAATATT GTACAAAATA ATATAGCTAT AAATATAATA 3900

TAGCTATAAA TATAATATGA GGGAGCGTAT ATTTTTAGCA TAATTCTTAA CAACACAGCA 3960

GAGAACAGAC AACCAGGAGG AAAATGAAAT GAATTTGTTA AAGAAAAATA AATATAGTAT 4020

TAGGAAGTAT AAAGTAGGCA TATTCTCTAC TTTAATCGGA ACAGTTTTAT TACTTTCAAA 4080

CCCAAATGGT GCACAAGCCT TAACTACGGA TAATAATGTA CAAAGCGATA CTAATCAAGC 4140

AACACCTGTA AATTCACAAG ATAAAGATGT TGCTAATAAT AGAGGTTTAG CAAATAGTGC 4200

GCAGAATACA CCTAATCAAT CTGCAACAAC CAATCAAGCA ACGAATCAAG CATTGGTTAA 4260

TCATAATAAT GGTAGTATAG TAAATCAAGC TACGCCAACA TCAGTGCAAT CAAGTACGCC 4320

TTCAGCACAA AACAATAATC ATACAGATGG CAATACAACA GCAACTGAGA CAGTGTCAAA 4380

CGCTAATAAT AATGATGTAG TGTCGAATAA TACCGCATTA AATGTACCAA CTAAAACAAA 4440

TGAAAATGGT TCAGGAGGAC ATCTAACTTT AAAGGAAATT CAAGAAGATG TTCGTCATTC 4500

TTCAAATAAA CCAGAGCTAG TTGCAATTGC TGAACCAGCA TCTAATAGAC CGAAAAAGAG 4560

AAGTAGACGT GCGGCACCGG CAGATCCTAA TGCAACTCCA GCAGATCCAG CGGCTGCAGC 4620

GGTAGGAAAC GGTGGTGCAC CAGTTGCAAT TACAGCGCCA TATACGCCAA CAACTGATCC 4680

TAATGCCAAT AATGCAGGAC AAAATGCACC TAACGAAGTG CTGTCATTTG ATGACAATGG 4740

TATTAGACCA AGTACCAACC GTTCTGTGCC AACAGTAAAC GTTGTTAATA ACTTGCCGGG 4800

CTTCACACTA ATCAATGGTG GCAAAGTAGG GGTGTTTAGT CATGCAATGG TAAGAACGAG 4860
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TCGTATACAT 
GGAACTGATA CGAATGACCA TGGCGATTTT AATGGTATCG AGAAAGCATT 4980
AACAGTAAAT CCGAATTCTG AATTAATCTT TGAATTTAAT ACAATGACTA CTAAAAACGG 5040
TCAAGGCGCA ACAAATGTTA TTATCAAAAA TGCTGATACT AATGATACGA TTGCTGAAAA 5100
GACTGTTGAA GGCGGTCCAA CTTTGCGTTT ATTTAAAGTA CCTGATAATG TGAGAAATCT 5160
CAAAATTCAA TTTGTACCTA AAAATGACGC AATAACAGAT GCGCGTGGCA TTTATCAACT 5220
AAAAGATGGT TACAAATACT ATAGCTTTGT TGACTCTATC GGACTTCATT CTGGGTCACA 5280
TGTTTTTGTT GAAAGACGAA CAATGGATCC AACAGCAACA AATAATAAAG AGTTTACTGT 5340
AACAACATCA TTAAAGAATA ATGGTAATTC TGGTGCTTCT CTAGATACAA ATGACTTTGT 5400
ATATCAAGTT CAATTACCTG AAGGTGTTGA ATATGTGAAC AATTCATTGA CTAAAGATTT 5460
TCCAAGTAAC AATTCAGGCG TTGATGTTAA TGATATGAAT GTTACATATG ATGCAGCAAA 5520
TCGTGTGATA ACAATTAAAA GTACTGGAGG AGGTACAGCA AACTCTCCGG CACGACTTAT 5580
GCCTGATAAA ATACTCGATT TAAGATATAA ATTACGTGTA AATAATGTGC CGACACCAAG 5640
AACAGTAACA TTTAACGAGA CATTAACGTA TAAAACATAT ACACAAGATT TCATTAATTC 5700
AGCTGCAGAA AGTCATACTG TAAGTACAAA TCCATATACT ATCGATATCA TCATGAATAA 5760
AGATGCATTA CAAGCCGAAG TTGACAGACG TATTCAACAA GCTGATTATA CATTTGCGTC 5820
ATTAGATATC TTTAATGGTC TGAAACGACG CGCACAAACG ATTTTAGATG AAAATCGTAA 5880
CAATGTACCA TTAAATAAAA GAGTTTCTCA AGCATATATT GATTCATTAA CTAATCAAAT 5940
GCAACATACG TTAATTCGAA GTGTTGATGC TGAAAATGCA GTTAATAAAA AAGTTGACCA 6000
AATGGAAGAT TTAGTTAATC AAAATGATGA ATTGACAGAT GAAGAAAAAC AAGCAGCAAT 6060
ACAAGTTATC GAGGAACATA AAAATGAAAT AATTGGTAAT ATTGGTGACC AAACGACTGA 6120
TGATGGCGTT ACTAGAATCA AAGATCAAGG TATACAGACC TTAAGTGGGG ATACTGCAAC 6180
ACCGGTTGTT AAACCAAATG CTAAAAAAGC AATACGTGAT AAAGCAACGA AACAAAGGGA 6240
AATTATCAAT GCAACACCAG ATGCTACTGA AGACGAGATT CAAGATGCAC TAAATCAATT 6300
AGCTACGGAT GAAACAGATG CTATTGATAA TGTTACGAAT GCTACTACAA ATGCTGACGT 6360
TGAAACAGCT AAAAATAATG GCATCAATAC TATTGGAGCA GTTGTTCCTC AAGTAACTCA 6420
TAAAAAAGCT GCAAGAGATG CAATTAACCA AGCAACAGCA ACGAAAAGAC AACAAATAAA 6480
TAGTAATAGA GAAGCAACTC AGGAAGAGAA AAATGCAGCA TTGAACGAAT TAACTCAAGC 6540
AACCAACCAT GCTTTAGAAC AAATCAATCA AGCAACAACA AATGCTAATG TTGATAACGC 6600
CAAAGGAGAT GGTCTAAATG CCATTAATCC AATTGCTCCT GTAACTGTTG TTAAGCAAGC 6660
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*TY* A TfZPfl A PT 


PA AG AAG AAA 


GAPAACPAGC 


AATTGACAAA 


GTGAATGCTG


CTGTAACTGC 


6780 


AC P A A A P A P A 


AAPATTTTAA 


A PG CTAATAC 


CAATGCTGAT 


GTTGAACAAG 


TAAAGACAAA 


6840 


TC PC ATT PA A 


GGAATACAAG 


CAATTACACC 


AGCTACAAAA 


GTAAAAACAG 


ATGCAAAAAA 


6900 


TGPPATCGAT 


AAAAGTGCGG 


AAACGCAACA 


TAATACGATA 


TTTAATAATA 


ATGATGCGAC 


6960 


CPTPGAAGAA 


CAACAAGCAG 


CACAACAATT 


ACTTGATCAA 


GCTGTAGCCA 


CAGCGAAGCA 


7020 


A A AT ATTAAT 


GCAGCAGATA 


CGAATCAAGA 


AGTTGCACAA


GCAAAAGATC 


AGGGCACACA 


7080 


A A AT AT ACTA 


GTGATTCAAC 


CGGCAACACA 


AGTTAAAACG 


GATACTCGCA 


ATGTTGTAAA 


7140 


TPATAAACPC 


PGAGAGGCGA 


TAACAAATAT 


CAATGCTACA 


ACTGGCGCGA 


CTCGAGAAGA 


7200 


PAAAPAACAA 


GPGATAAATC 


GTGTCAATAC 


ACTTAAAAAT 


AGAGCATTAA 


CTGATATTGG 


7260 


■T»PTP & PPTPT 


APTAPTCPGA 


TGGT PAAT AG 


TATTAGAGAC 


GATGCAGTCA 


ATCAAATCGG 


7320 


rrr apttpa a 


PPGPATGTAA 


PGAAGAAACA 


AACTGCTACA 


GGTGTATTAA 


ATGATTTAGC 


7380 


fi A PTfiPT A A A 


AAGPAAGAAA 


TTAATCAAAA 


CACAAATGCA 


ACAACTGAAG 


AAAAGCAAGT 


7440 


VjVJ<— X X l>vin 1 


PA ACTGG ATP 


AAGAGTTAGC


AACGGCAATT 


AATmATATAA 


ATCAAGCTGA 


7500 


•T*RPA A ATVPP 


PA APT AC ATP 


A AG PG CAACA 


ATTAGGTACA 


AAAGCAATTA 


ATGCGATTCA 


7560 


txCvJAAAlAl X 


fiTTA A A A A AP 


PTC PAG PATT 


AGCACAAATC 


AATCAGCATT 


ATAATGCTAA 


7620 


ATT AG L I Aj M 


ftTPA ATPPT A 
AX K,t\t\ loLln 


PAPPAGATGP 


AACGAATGAT 


GAGAAAAATG 


CTGCGATCAA 


7680 


TACTi I AAA! 


P A ACZ AP ACAP 


A AP AAG PT AT 


TGAAAGTATT 


AAACAAGCTA 


ACACAAATGC 


7740 


ARAARTAGAC 


PAAG CTGCGA 


CAGTAGCAGA 


GAATAATATC 


GATGCTGTTC 


AAGTTGATGT 


7800 


AGTAAAAAAA 


CAAGCAGCGC 


GAGATAAAAT 


CACTGCTGAA


GTGGcGAacG 


TATTGaAGCG 


7860


GAAAKGCAGG


CTGCTGTTAA


TCAAATCCAA




GTTAAACAAA 


CACCTAATGC 


aactgacgaa


7920


TCAACTTTAA 


AGATTCAAGC 


AATTTAATCC 


AAATTTAATC 


CAAAACCCAA ACAAATGGAT 


7980 


TCAGGGTAGG 


ACACCACTTA 


CAAATCCAA 








8009 




(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10953 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

ACCCACCCCn TGGGGATAnT TTACCTGGTG GGGCCTTCGA TTGCCTTTAG GTGAAACCaG 
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50 



AGATGAATGC TAACCATATT CATTCTGCTA AAGATGGTCG TGTTACTGCG ACAGCTGAAA 180
TTATTCATCG AGGTAAGTCG ACACATGTAT GGGATATAAA AATTAAGAAT GACAAAGAAC 240
AATTAATTAC AGTTATGCGT GGTACAGTTG CTATTAAACC TTTAAAATAA AAGAACTGCT 300
AGCTGAAATG TTATGAGATA TTCATAACTA CGGCTAGCAG TTTTTTTATG CGCTATATTG 360
TTGTAGTTTT AGAAATGCTT GTTCAATGCG TTCGGCAGCT TTACGGCCAC CCATAACATT 420
TCTACCAAAT GGTCCTAATT CTAAGTCTGC AAAGCATCCT GCGACAAATA GATTTGGTAT 480
CCATTCTAAT TTTTCGGAAA TAACAGGGTA ATTACATTCG TTGATAGGTG CATCATAATT 540
TTGTATTAAT TGCTTAATAA GTGGTTGTGA CATAAAATCT TGTTCAAAAC CAGTTGCAAC 600
CATAATCTGT TGATATGGAA CAGAATCATT TTCAGTGTTA ATTACACCAC CACTAATTTG 660
AGTGATAGGT GTTTTATGCa CATTTATACG ACCATTTTTA ATATGTTTTT TAAGGCGTAA 720
GTACAGTTCG TGAGGCATTG ATCCTTTATG ACGTTCGCGT TGTACAATGG CATTTCTTTC 780
AGGCATGCTT TTAGTACTTA AAAATGAAGA CATATTTTTC GGACCTAACC AACCAGGATC 840
AGCATCAAAG TCATGTATTT CAATATCTTT ATTTAGCCAT AAATGAATCT TTTTATCGTT 900
ATCATGATTT AACAATTTAA GTGCAAGATG TGCAGCAGTa ATGCCGCTAC CAACGATATG 960
ATCGGTCTTA TCATATACTA CTTGATCAAG TTCTTTCTCG AAGATATGAT TTACATTCTG 1020
TTTGTCTTTT AAAATGTCAG GCATAAACGG AATATTTGTA CTGCCTATTG CAATAACGAC 1080
GCAATCTGTA GTGATAATTT GTCCATCTTC TAACTTGATA TGCCATTTGT CTTCTTGTTT 1140
ATCTAAAGTT TGAACTAAAC CTTGAACCAA GCAATCCTCT AATTGATATT GTTTAGAAGC 1200
ATGTGCAATA TGATCCATAA ACATTGTCAA TTCAGGTCGT TGATAAGGAC CATAAAAAGC 1260
ATTTGTATAT TGGTGCTGTT TAGCGAATTG TTTTAGATGG AACGGTTGTG GATGTACGTG 1320
ATGTACAATC GGTGATCTTA AATAAGGCAT TTCTATTCGA TTTGTATATG AGTTAAACCT 1380
TTGGCAAAAA GTTTCGTGTG GGTCAATGAT TGTTAATCGG TCTGTTGTTA ATCCGCTTGA 1440
TAATAGTTTT TGTGCGATTG CAGTTCCCTG TATGCCACCG CCGATAATTG TCCAATGCAT 1500
AATAAAACCT CTCTCTTTTT AAAACGTAAT AGTTACGATT TATAATTATT ATTATCATAA 1560
TACATAACGA CATGAAAGGC AATTAAATTA AAGAGATATA TGTAGATAGG GCGAATCTGT 1620
AGTCAAAGAA AAAATCATTG AAAAAGAGGT AACAATGTCA AAAGAwAACA GCAGTAAAAT 1680
CATTCCTAAT TTGGAATCAT CTTACTGCTG TTTGTTGTTG ATTTATATTC ATGATTTTGT 1740
TATATAATCT ACAATTTTGT GTCTTTTAAG TCTTCCGAAA TTTCATCGAC TTTAGTCTTT 1800
TTAGTATAAG GCGTTTTAAT ATTATATGCT GCTTTCATAA TCATATGACT TGAAAGAGGA 1860
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GCAATAAAAT ATAAAAACGT ACCAAATAGT AATGACATTG CACCTAATGT TGATGCTTTT 1980 

CCGGCAGCAT GTGCACGTGA ATATACATCT TCAAGTCTCA ATAATCCTAT AGCTGCTAGG 2040 

GCGCTAATTA AAGCACCGAT GATAACAAAG ATAAGTGCAA GACTAATCAG TATGATTTTG 2100 

ATCATGTTCA ATCACCTTAC CTTTGTCCAT AAATTTAGAG AATACTGCAG TACCTAAAAA 2160 

AGCTAATATA CCAATCATCA TAATAACGAC AATCATGTAT TTAATATTTA ATAAAATACT 2220

GAATAATGCT ATAACTGCCA TTAATTGAAG ACCAATCGCA TCTAATGCGA CAACACGATC 2280

GGCAAGTGAT GGGCCTAGCA CAACGCGAAT GAGCATAGCT AACATAGAAA TGACAACTAT 2340

GATTAATGCA ATAACGATAA TAACATTATG ATTCATTATA TTTCGCCCAC CTCTCTTACA 2400

ATTTTCTCTA ATGATGTTTT AATACTTTCT ACTTCTTGCT CTTTAGTTGA AAAATCTATG 2460

GCATGAATAT AAATTTTTGT ACGATCGTCA CTTACACCAA GCACTACAGT ACCAGGTGTT 2520

AATGTAATTA AATTAGACAG CAAGACAATT TGCCAATCTT TTTTTAAATC TGTGTGATAA 2580

ACAAAGAATC CTGGTTCATT TTTAATCGAA GGTTTAATAA TAATTTTCAA AACATCAAAA 2640

TTAGCTTTAA TCAGTTCGAT TAAGAAAATA ATAACTAATT TAATAATACG ATATAGCGTG 2700

ATGACATAAA ATCTACCTGG TAACACTCTG TGTAAGAGGT AAACAAGAAC TAGGCCAAAG 2760

ATGAAACCTA ACACAAAGTT ATTTGTTGTG TAACTATTTG TCACAAACAA CCAAAACACT 2820

GCGATAATAA AGTTTAATAC TAATTGTACA GCCATGTTAT TTACCTCCTA ATACAGCTTT 2880

AACGTAGGTT GATGGATTGT AGAATGTTTC TGCACCAGCT TTTACCATTG GATATAAGTA 2940

ATCTGCTGAC AATCCATATA AAACAGTTAT CACAACTGCA ACGATTGCAA TCGTAGTTAA 3000

ATATTTGACG TCGACTTTGT TATTAAGATC ATATCCTTTT GGTTGACCGA AAAAGCCTTG 3060

TAGGAATATG CGAATGACAG AATATAATAC GACTAAACTT GATAATAAGA CGATGACACC 3120

ACTTAAATAA AATCCTCTTT CAAATGTTGA TTGGACAATA AAAAATTTTC CATAAAAGCC 3180

ACTGAGTGGG GGAATGCCAG CTAAACTTAA TGCTGCGATA AAGAATGACC AACCAAGTAC 3240

AGGATATCGT TTAATTAAGC CACCAAATTG TCTTAAATCA GCAGTGCCTG TAATTTTAAT 3300

CATAATTCCG ATAAGCAAGA ATAATGCAAG TTTTACTAAC ATGTCGTGCA ATGTATAGTA 3360

AATAGCCCCA ATCATACCTG ACTCTGTCAT CATTGCAACG CCGACTAAGA TCACACCTAC 3420

AGCAATCATG ACATTGTATA GGATGATTTT TTTAATGTTG GCATATGCAA CAGCACCGAC 3480

ACAACCAAAG ATGATCGTTA ATAGTGCTAA GAATAAAATG ACATAATGTG AAAAGCTTAC 3540

ATTATCACTA AAGAATAGGC TCAATGTTCT AGCGATTGCA TAAACACCAA CTTTTGTTAA 3600

CAAAGCACCA AAGAATGCAA TGATTGGAAT TGGTGGgCAT AGTATGCACT AGGTAACCAA 3660 
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ATATTGACTA AGCCACTGTC 


ATGCGCTGAA


AGGTTAGCTA


ATTTATTGCT


TATATCTGCT


3780 




AGATTCAATG TTCCTACTAC 


TGAATATAAA 


ATCGCTACAC 


CCATTACGAA 


GAAGGATGAC 


3840




GATACAACGT 


TAACAAGAAC


ATATTTTATT 


GTTTCTTGTA 


GTTGAATTTT 


TGTAGAACCA 


3900 




ATTACTAATA 


AGAAATAAGA


TGACATTAAA 


AATACTTCGA 


AAAATACGAA 


TAGGTTGAAA 


3960 




ATGTCACCAG 


TTGTGAATGC 


ACCAATGATA 


CCTATTAACA 


TAAATAGTAC 


TGAAAAATAA 


4020 


TAATAATATC 


TTTCACGTTC 


AATACCAATT 


GTTTGGTATG 


AATATAAAAT 


CACAATAGCT


4080 




GTAATAATAA 


TACTAGTAAT


TATTAGTAGG 


GCACTGAATA 


TGTCTAATAC 


AAAGACAATA 


4140 




CTGTATGGTG 


CTTTCCATGA 


ACCTAGCTCT 


ACGCGTATTG 


GTCCATGTTT 


AACAACATTT 


4200 




GCTAAATTGA 


TAATTGCCGC 


GACCAAGGTT 


AATAATGTAC 


CGCCTAGTGC 


GACATAACGC 


4260 




TTTATAATAG 


GACGCTTTCC 


AATAAAGACA 


AGTAATATGG 


CTGTAATTAC 


TGGAATAACT 


4320 




AGCGTTAACA 


CAAGCATATT 


ACTTTCAATC 


ATCTTCTGGA 


ACTCCTTTCA 


TACTCTCAAC 


4380 




GTTATCTGTG 


CCTAATTCTT 


TATATGTTCT 


AAATGCTAAT 


ACTAAGAAAA 


AGGCTGTTGT 


4440 




CGCAAgGCGA TAACGATTGC 


TGTTAAAATA 


AGTGCTTGCG 


GGaTAGGaTC 


AACATAGCTT 


4500 




TTTACGTTCG 


CTTCATAAAT 


TGGAACAGTA 


CCATGTTTAA 


GTCCGCCCAT 


AGTTATTAAA 


4560 




AATAAATTTG 


CTGCATGTGT 


TAATAGTGTA 


GTTCCCATAA 


CAATTCGTAT 


CAGACTTTTA 


4620 




GACAAAACGA 


GATAGACACT 


AATTGCTGTG 


AGAATACCAC 


TAACAAAAAT 


CATAATAATT 


4680 




TCCACTATTC 


GTTCTCTCCA 


ATCGAAATAA 


TAATTGTCAT 


GACAGTACCA 


ACTACTGCAC


4740 




ATAAAACACC 


GAAATCAAAG 


AATACTGCTG 


TTGTCATATG 


AACAGGTTCT 


AATATAAATA 


4800 




ACGGTATATC 


AAATGTGACA 


TGCGTAAAGA 


AATTTTTGCC 


TAAAAACCAA 


CTTGCGATAG 


4860 


GCGTCGCAAT ACAAAAAACT 


AATCCGATAC 


CTATCAAGAT 


TTTAAAATCT 


AATGGGAAAA 


4920 




TTTTACGCAT 


TGTTTCTATA 


TCAAATGCAA 


TCGTAATGAT 


AACAAGTGAA 


CTTGCGAATA 


4980 




ATAATCCGCC


GACGAAACCG 


CCACCAGGTG 


TATAATGTCC 


TGCTAAGAAA 


AGTGAAAAAC 


5040 




CAAAGACCAT 
GTTGTCTATT 


TACCATGAAA 


AAGATAATAA 
CACCTCGTTA


CTGCAGCAAA 


TTGCAAAATT 


AGATCATTTT 


5100 




CATGATTTTT 


CCTTGCGTTT 


GACGCTTTTT 


ACGTAATTTA 


5160 




ATCATTGTAT 


ATACAGCTAA 


TCCTGCGATA 


CCAAGCACAG 


ATGACTCGAA 


TAAAGTATCC 


5220 




ATACCACGGA 


AATCAACAAG 


TATGACGTTT 


ACCATGTTTT


TACCGTGAGC 


tAAATCATAA 


5280 




ACGTGCTCTT 


GATAAAACTT 


AGATATCGAT 


TCAAAATGTC 


TATTTCCGTA 


TGCAATTAAA 


5340 




CCGATAATAA 


TGACGGACAA 


ACCAACACCA 


CCAGCAATTA AAGCATTAGT 


AAGCTGGAAT 


5400 




GAGCGCTTTT 


CATTATAACG 


ATTTAAATTT 


GGTAAGTGGT 


AGAAGCATAA 


TAAGAACAAT 


5460 
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ATAAACAATA CAGACACAGC ATATCCAACT GCACTTAACA TAATGATGCT AAATAATCTT 5580 

GATTTAGCGA AAAGAATTAA AAAGGCAGCA CTTAATAATA AAATTACGAT ACAAACTTCG 5640 

AAAATTCTAA TCGGACTAAC GTCTTTAAAA TTAATGTTGA AAGGTACTGA GAATATAGTG 5700

ACAAATGTTA ATAAAATTAA TGCACCAAAA ATGATAACTA AATTATTACG TGAATAATCG 5760

GTAACATAGC TATTCGTCAT CTTTTCAGAG TAGTTTGGAA TAACATTTGC ACTTCTGTTG 5820

TACCAATAAT TGAATGTTAG TTTACCAGGT TGTCGTTGCA ACAATTTCAC CCAATAACTA 5880

AATGTCACAA TTAGTAAGAT ACCTAAAATA TAAATCACTA ATGTTGATAA AAAGGCAGGC 5940

GTTAATCCAT GGAACATATG GAATTCAACA TCATCAATTA CCGTATGATT AATCGAAGag 6000

TnAGCTGGTT CAATAATCGA ATTAGTTAAA ATGCCAGGGA ATAAACCAAA TACAATTACT 6060

AATGTAGCTA AAATAGCTGG TGATAAAAGC ATTAATATTG ATACTTCGTG TGCTTTTTTA 6120

GGTAATTGTT CAGGTTTATA TTGTCCGAAA AATATATGCA TTATAAATTT AATTGAATAT 6180

ACAAATGTGA AGACACTGCC CACTATACCA ATGATTGGGA ATAGGTAGCC TAATGTATCA 6240

ACACTGAATA AATTTGCTTG GCTTGCTGTA AATGTTGTTT CTAAAAATGA TTCTTTTGAT 6300

AAGAAACCAT TGAACGGTGG TACACCAGCg CATACTTAAT GCTGTAATAA CAGTGATTGT 6360

AAATGAAATA GGCATAATTG TTAGTAAGCC ACCTAATTTC TTAACATCAC GTGTACCAGT 6420

AGAATGATCC ACTGCACCTG TAATCATAAA TAGGGCACCT TTAAATGTTG CATGGTTGAT 6480

TAAATGGAAT ATTGCAGCCG TAAATGCAGC AGCATATATT TTGCTATCAT CGCCTTGATA 6540

GTGATAACTA ATGGCACCGA TTCCAAGCAT CGCCATAATC ATACCTAATT GGGATACTGT 6600

TGAAAATGCC AGTATACCTT TCAAGTCTTG TTGTTTTGTT GCGTTTAGCG AAgCCCAGAA 6660

TAATGTAATT AAACCAACGA GTGTGACAGT CCATACCCAA CCTTGCGATG CTGCGAAGAT 6720

TGGTGTCATT CGAGCGATTA AATATAACCC TGCTTTAACC ATTGTTGCTG AATGAAGATA 6780

AGCACTGACT GGTGTAGGTG CTTCCATTGC ATCTGGTAGC CAAATATAAA ATGGAAACTG 6840

AGCAGATTTT GTAAAAGCAC CAATCATGAT TAAAATCATC GCAAAAATGA AGAATGGGCT 6900

ATTTTGAATT TCAGAAGCAT GTTGAATCAT GTACTGAATG CTAAATGATT GTGTTGGTAT 6960

AGCGAGTAAG ATGATACCAC CTAATAATGA TAGACCACCA AATACTGTGA TTATGAGCGA 7020

TTTTTGAGCA CCATATATAG ATGCTTGTCG TTCGCGCCAG AATGAAATAA GTAAAAAACT 7080

AGAAAATGAC GTTAGCTCCC AGAATAAATA TAGAATAATA ACATTATCTG AAAGTACGAC 7140

ACCTAACATT GCACCCATAA ATAGTAATAA ATAACAATAA AAATTCCCTA GTTGTTCTGA 7200

CTTACTTAAG TAGCCGATTG AATATAATAC TACTAAACTG CCGATTCCTG AAATAAGCAA 7260



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CCAATTTAAG 


GITITCATTA 


CAGTATTACC 


TGACATCGTC 


GTTTTAATTA 


ATGTAAGCAT 


7380 




ATAAATAAAT ATGACGATAG 


GGACAGGTAA 


TACGAACCAT 


CCTAAATGTA 


TACGTTTAAA 


7440 




AAATCTATAC 


AGGATAGGAA 


TAATGAGTGC 


GAATATTAAC 


GGTAATATCA 


CCGCAATATG 


7500 




TAACAAACTC 


ACTATGTTGT 


CCTCCTTTAA 


AAAATATTTA 


TGTTATTCAT


TATACATGAA 


7560 




TGATATAGTT


CTGAAAAACG 


TACACACTCC 


TTGTTGTGCT 


TTATTTTCAG 


AaGTATTTAA


7620 


ATAAGAAGAA 


ACACGTPATT 


TTTTATTTAA 


AATTTTCTTT 


GTATTGAAGT 


GAATAATCTT 


7680 




C'lTI'l'AAGCG 


TG CTAAA fTA 


GCTAAAGACA 


TTTCAGCATG


TTTTGTTTGC 


TGAGCTTTAA 


7740 




GTTTAGTTTC 


TAAA tctczt zv 

innni ^- X vj x^\ 


ATTGCTTGTT 


GAAGTGAATC 


TTCATAGCGC 


AATACATCAA 


7800 




CATTGAAGTC 


W-VJXAirtJ. IVj J, 


GAACGTTTCG 


TATAGCGTTT 


TTCAAAATGG 


CTTAATGCTT 


7860 




TGCGGTCATG 


f?AAAZV ATZiflV 


CCTTCAGTTT 


CAGTAGGGTT 


ATGTAAATCA 


CCTTGTTTCG 


7920 






nnV- 1 lull t_A 


ACTTTAACAA 


GGACATCGTC 


TCCATTTTCT 


TCAACAATCG 


7980 




TGACACCATA 




TTGTGTGAAA 


ATCGATATAG 


CTTCATGCTA 


TTTTCCTCCC 


8040 




TTAAAAf^TAT 




ATGTATCATA 


ACATGAATGG 


AGAATATAAA 


TGGCTAACTA


8100 




TCCACAGTTA




TACAACAAGG 


TGAAATCAAA 


GTGGTTATGC 


ACACAAATAA 


8160 




AGGTGACATn 


TkC'&TTr' 2V 2V fcT* 


TATTTCCAAA 


TATTGCACCA 


AAAACAGTTG 


AAAATTTTGT 


8220 




GACACATGCA 


nnnnrt 1 vjVj X X 


ATTATGATGG 


AATCACATTC 


CACCGTGTCA 


TTAATGACTT 


8280 




CATGATTCAA 


o\j X VjvS^-Ort X ^ 


CAACAGCTAC 


TGGTATGGGT 


GGCGAAAGTA 


TTTATGGCGG


8340 




TGCTTTTGAA 


VJrt x X X X X 


CATTAAATGC 


ATTTAACTTA 


TATGGCGCAT


TATCAATGGC 


8400 




TAACTCAGGA 


CfT A A T zv X IV 

V- V_ X Art X J-\\— X 


ATGGTTCACA 


ATTT1TCATT 


GTTCAAATGA 


AAGAAGTACC 


8460 


TCAAAATATG 


TTAAGTCAAC 


TTGCAGATGG 


TGGCTGGCCT 


CAACCAATCG 


TTGATGCATA 


8520 




TGGCSAAAAG 


GGTGGTACAC 


CATGGTTAGA 


TCAAAAACAT 


ACAGTATTCG 


GTCAAATCAT 


8580 




TGATGGTGAA 


aCTACATTAG 


AAGATATTGC 


AAATACAAAA 


GTGGGACCAC


AAGATAAACC 


8640 




ACTTCATGAT 


GTTGTAATTG 


AATCTATTGA 


TGTTGAAGAA 


TAATATCTAA ACATAATTAA 


8700 




CTACCAACAT 


TTTAAACTCG 


GATAAAGCTA 


ATTTATGAAT 


GGATTAGTAT 


ATATTCCAAC 


8760 




gAAAATAAAT AAACTAATAT GATGAGCAAT 


CTCAATATAT 


TTATCaAGAA 


AGCACAGTTT 


8820 




TTAAATAGAT 


GTGTATTTTA 


AAGATAATAG 


TTGAGGTTGC 


TTTTTATGTT 


TTTACAGAGA 


8880 




ATTGCTATTC 


AAATAGTAAA 


TAAATTGAAA 


ACAAAGTAGC 


TGGATATCAT 


ATTGATTTAG 


8940 




ATAGGAATTT 


GTTGCTAATT 


TTATTTGTAA 


ATCCAAGTTT 


GTAGAATTCT 


TATTCATTTA 


9000 




TAAAATAATA 


TTCGTATGAT 


TTGATTTTTT 


AATTAGTCCA 


CCATTTCGAT 


TTGTGCTATG 


9060 
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10 






AACATATCAA GGTGCGTGTA CTGGTATTCA ACCATACGGT GCGTTTGTTG AGACCCCTAA 9180
TCATACTGAA GGACTGATTC ATATATCAGA AATTATGGAT GACTACGTTC ATAATTTGAA 9240
GAAATTTCTA TCAGAAGGCC AAATTGTTAA AGCTAAAATT TTGTCTATAG ATGATGAAGG 9300
AAAGCTTAAT CTATCATTAA AGGATAATGA TTACTTCAAA AATTATGAGC GTAAGAAGGA 9360
AAAACAATCA GTATTAGATG AAATCAGAGA AACAGAAAAA TATGGGTTTC AAACACTTAA 9420
AGAACGCTTA CCAATCTGGA TAAAACAGTC AAAGCGAGCA ATTCGAAACG ACTAAAGGAA 9480
CAGATAAATC GTACCGAAAA TCATACAAAG GGTCTGAAAT GAAAGTTTCT TAGACTATAA 9540
AAGAGATTAG TATCTATTAA ATTTTATTAG ATACTAATCT CTTTTTGTCT ACGATAACGT 9600
AATATGaTTG ATTCTATTTA CACGTACAAA TGGTTTAAGG TGACATATCC ATTATCTTTG 9660
TTAGATAGAA TCGTTGATTT GCaATATTGT ATGTGGATTT GTTTTTTTTA TTTATTTTAG 9720
AAATGAGAAC TACAACTTAA AGTATTAAAC GAATTGCAAC TATATAAACA GATAATTGGA 9780
GAATGAAAAA ATTACATGTT ATAGTCAACT CAATAATTTT AAGGAGGAAT TAAGTAATGA 9840
AAAGTAAATA CGAACCATTG TTTGATAAAG TAGAATTACC AAATGGAGTA GAGTTGAGAA 9900
ATCGATTTGT GTTAGCCCCT TTAACACATA TTTCTTCAAA TGATGATGGT ACTATTTCAG 9960
ATGTAGAACT TCCTTATATT GAAAAGCGTT CACAAGATGT TGGTATTACA ATTAATGCTG 10020
CGAGTAATGT GAGTGATGTC GGAAAAGCAT TTCCAGGACA GCCATCAATC GCGCATGACA 10080
GTAATATTGA AGGACTAAAA CGATTAGCTA CAGCAATGAA GAAAAACGGT GCCAAAGCAC 10140
TCGTACAAAT ACATCATGGC GGTGCACAAG CATTGCCTGA ATTAACACCT GATGGAGACG 10200
TCGTAGCACC AAGTCCAATT TCTTTAAAAA GTTTTGGTCA GAAACAAGAA CATAGTGCTA 10260
GAGAAATG£C GAXTGAAGACT ATTGAACAAG CAATCAAGGA TTTTGGTGAA GCAACC^GAC 10320
GTGCAATTGA AGCAGGGTTT GATGGTGTTG AAATACATGG CGCGAATCAT TACTTAATTC 10380
ATCAATTTGT ATCACCATAC TATAATAGAA GAAATGATGT ATGGGCAAAT CAATATAAAT 10440
TCCCGGTCGC TGTGATTGAA GAAGTACTTA AAGCGAAAGA AGCGTATGGC AATAAAGACT 10500
TTATAGTTGG ATACAGATTA TCTCCAGAGG AAGCGGAGTC TCCAGGAATC ACAATGGAAA 10560
TTACAGAGGA ACTCGTTAAT AAAATTAGCC ATATGCCAAT CGACTATATT CATGTTTCAA 10620
TGATGGATAC GCATGCAACG ACACGTGAAG GTAAATACGC TGGACAAGAA AGACTGCCTT 10680
TAATTCACAA ATGGATAAAT GGTCGTATGC CACTTATCGG TATTGGTTCA ATTTTCACAG 10740
CTGACGAAGC TTTAGATGCA GTTGAAAATG TTGGTGTTGA CTTAGTAGCC ATTGGTAGAG 10800
AGCTACTACT GGATTATCAA TTTGTTGAAA AAATTAAAGA TGGACGGGAA GATGAAATTA 10860
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AATTTAATGA AGGGTTTTAT CCATTACCAC GTA 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 





TTTGATAnAA 


AACTGAATnA 


ATTAAATGTA 


TCGATTCAAC 


CTAATGAAGT 


GAATTTACAA 


60 




GTTAAAGTAG 


AGCCTTTIAG 


CAnAAAGGTT 


AAAGTAAATG 


TTAAACAGAA 


AGGTAGTTTA 


120 




GCAGATGATA 


AAGAGTTAAG 


TTCGATTGAT 


TTAGAAGATA 


AAGAAATTGA 


AATCTTCGGT 


180 




AGTCGAGATG 


ACTTACAAAA 


TATAAGCGAA 


GTTGATGCAG 


AAGTAGATTT 


AGATGGTATT 


240 




TCAGAATCAA 


CTGAAAAGAC 


TGTAAAAATC 


AATTTwCCAG 


AACATGTCAC 


TAAAGCACAA 


300 




CCAAGTGAAA


CGmAGGCTTA 


TATAAATGTA 


AAATAAATAG 


CTAAATTAAA 


GGAGAGTAAA 


360 




CAATGGGAAA 


ATATTTTGGT ACAGACGGAg 


TAAGAGGTGT 


CGCAAACCAA 


GAACTAACAC 


420 




CTGAATTGGC 


ATTTAAATTA 


GGAAGATACG 


GTGGCTATGT


TCTAGCaCAT 


AATAAAGGTG 


480 




AAAAACACCC 


ACGTGTACTT 


GTAGGTCGCG 


ATACTAGAGT 


TTCAGGTGAA 


ATGTTAGAAT 


540 




CAGCATTAAT 


AGCTGGTTTG 


ATTTCAATTG 


GTGCAGAAGT 


GATGCGATTA 


GGTATTATTT 


600 




CAACACCAGG 


TGTTGCATAT 


TTAACACGCG 


ATATGGGTGC 


AGAGTTAGGT 


GTAATGATTT 


660 




CAGCCTCTCA 


TAATCCAGTT 


GCAGATAATG 


GTATTAAATT 


CTTTGGATCA 


GATGGTTTTA 


720 


AACTATCAGA 


TGAACAAGAA 


AATGAAATTG 


AAGCATTATT 


GGATCAAGAA 


AACCCAGAAT 


780 




TACCAAGACC 


AGTTGGCAAT 


GATATTGTAC 


ATTATTCAGA 


TTACTTTGAA 


GGGGCACAAA 


840 




AATATTTGAG 


CTATTTAAAA 


TCAACAGTAG 


ATGTTAACTT 


TGAAGGTTTG 


AAAATTGCTT 


900 




TAGATGGTGC 


AAATGGTTCA 


ACATCATCAC 


TAGCGCCATT 


CTTATTTGGT 


GACTTAGAAG 


960 




CAGATACTGA 


AACAATTGGA 


TGTAGTCCTG 


ATGGATATAA 


TATCAATGAG 


AAATGTGGCT 


1020 




CTACACATCC 


TGAAAAATTA 


GCTGAAAAAG 


TAGTTGAAAC 


TGAAAGTGAT 


TTTGGGTTAG 


1080 




CATTTGACGG 


CGATGGAGAC 


AGAATCATAG 


CAGTAGATGA 


GAATGGTCAA 


ATCGTTGACG 


1140 




GTGACCAAAT 


TATGTTTATT 


ATTGGTCAAG 


AAATGCATAA 


AAATCAAGAA 


TTGAATAATG 


1200 




ACATGATTGT 


TTCTACTGTT 


ATGAGTAATT 


TAGGTTTTTA 


CAAAGCGCTT 


GAACAAGAAG 


1260 




GAATTAAATC 


TAATAAAACT 


AAAGTTGGCG 


ACAGATATGT 


AGTAGAAGAA 


ATGCGTCGCG 


1320 
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CTGGTGATGG 


TTTATTAAPT 


wlnl 1 LMA 1 


1 MvjL 1 ILlol 


nAlAAAAAiVJ 


IV PTfi^TH Tv TAT 
ML 1 1 MMM 1 


1440 




CACTAAGTGA 


ATTAGCTGGA 


CAAATGAAAA 


AATATPPAPA 
nn x A x V- LnLM 


nlV^l 1 nn X X 


A A PPTA PPPP 
MMLv? X MLvjLVj 


1500 


5 


TAACAGATAA 


ATATCGTGTT 


GAAGAAAATG 


TTftAPfiTTAA 


AGAAGTTATG 

nVJnn\J X X n. X w 


APT A A APT AP 
ML X nnnVJ X Aw 


lbbO 




AAGTAGAAAT 


GAATGGAGAA 


GGTCGAATTT 


TAGTAAGAPP 


TTPTPP A A PA 

A A L A VjOMnLn, 


a APPATTAPT 


162 0 


10 


TCGTGTCATG 


GTTGAAGCAG 


CAACTGATGA 


AGATGPTGAA 


A P A TTTYt P 21 P 


AAPAAATAPP 
MMLMMM 1 MVjL 


IbBO 


TGATGTGGTT 


CAAGATAAAA 


TGGGATTAGA 


TAAATAAATA 
x rt/vn x nnn x n. 


PTPTATTAPA 
LlOXnl XMLM 


a ATP IV PPPP 2V 
MM I LiMVjLLv»M 


174 0 




TGCGTATGcA 


nTcg t TTTIT 


GTGTTTGTAG 


AAATAATTTA 

XW\X/V\X X Xn 


TAGTAPAAAP 

X SV>J X nLnnnL 


PTA A A ATP AT 
Vj X MMMM X on 1 


t q n f\ 
X o U U 


1 5 


ATAAACAAAA 


TAAAAACAAA 


GTAATCAATA 


TGTAATATAA 


AATACACTGG 


TACTCAATAT 


1860 




ATAATGATGA 


TAAAATTAAT 


x i A <TVr^ X AAO/\ 


X AVlAU X 1 V* L X 


llu XVjr 11111 


a zv /v p a jv *rv^ 

AHUj LMVjM I Vj 


1920 




CTACTACTTA 


TCTTAACAGT 


TG ATT A AP.TP, 


rtrVrt A V— r\ X X XM 


APAnpniiPa iv 

rVV- MkJ LvsMtyMM 


T"l TV TP* TV TV r*PA 
1 MM 1 LMMLLM 


1980 


20 


GGAGGATGAC 


TTAATGAATT 


TATTPA^ArA 

X / » X X v__rVlxnVb~n 


APAA A A ATTT 
nLA\nnnnX X X 


A PT A TP A /I IV A 
Aw 1 M 1 LM\? An 


AATTTAATGT 


204 0 




CGGTATTTTT 

* «» X X X X X 


TPAG("]"I u rA A 


ttp, p p a ptpt 


1 ML 1 i i inln 


TV " i 'TV /T*7V TV f~*f~* 
1 L 1 ML i MALL 


t~*f~* 7\ A T\^TVO/* , « 

L bAv_AACA(jL 


2100 




GTCTGCAGCA 


G AGCAAAAT P 


rVVJvv, X ULnLn 


A A ATP A A PP A 




L I vaM IVjLLAA 


2160 


25 


TACACAGCCT 


AACGCAAATG 


ptpptp ptp a 


APPTA ATPPT 
MIjiL Inn! X 


AP&PPAPAPP 


P A O /"""TV' TV 

lmvjl xolAll 


2220 




TGCCAACCAA 


GPAPAAPPAP 


PA(7TAP^Arr 


APPA A APPIV ZV 


PPTPP A P A. PP 


L 1 MM I CCAGC 


2280 




AGGAGGAGCA 


GCACAACCAA 


ATACACAACC 


AGCTGGACAA 


GGTGATCAAG 


CTGATCCGAA 


2340 


30 


TAACG CTGCA 


CAAGCAPAAP 


PTPP. A A 2VTPA 


app a artrrc 

<rivj LMMLML L ^7 


ppzvta zvppt± zvp 




24 00 




AAATAACCAA 


GCAACACCTA 


ATAATAATGC 


AACACCGGCA 


A ATP A A APAP 


APPP APPP7V TV 
Mu L LMAj LVaMM 


24 o U 


35 


TGCTCCAGCA 


GCAGCGCAAC 


CAGCAGCACC 


TGTAGCAGCA 


A APPPAPA A A 
MML LHLanM 


PTPTA ZVPTVTPP 
L 1 LMMAjM ILL 


2 D2 U 




AAATGCTAGC" 


"AATACTGGTG" 


"AXGG'CAGTAT" 


"TAATACGACA - 


1 lnnwii X lo 


ATY*2 IV'PPP'PP P 
M 1 VjM 1 LL IVjL 


*5 c q n 
23oU 




CATATCAACA GATGAGAATA GACAGGATCC AACTGTAACT 


PTT7A P21P ZV Ttt 


TV 2VPT*ZV IVIVTV^P 


2o4 0 


40 


TTATTCATTA 


ATTAACAACG 


GTAAGATTGG 


TTTCGTTAAC 


TP AP A ATT A A 
x Lnunn X X MM 


PJ A PP A A PPP 2V 
vjML VjMMLtLAjM 


2 /UU 


TATGTTTGAT 


AAGAATAACC 


CTCAAAACTA 


TCAAGCTAAA 


PPAAAPPTPP 


PTPPAT*P7xPP 


2 / D U 




TCGTGTGAAT 


GCAAATGATT 


CTACAGATCA 


TGGTAACTTT 


AAPPPTATTT 


P 2V A ZV ZV A r"lY3T 


o o o n 
£. C2U 


45 


AAATGTAAAA 


CCAGATTCAG 


AATTAATTAT 


TAACTTTACT 


APT ATPPZk. ZV IV 
ML 1 MX VjLMMM 


PP Ti TV T» A ^ ,r P A A 


2 o o U 




GCAAGGTGCA ACAAATTTAG TTATTAAAGA TGCTAAGAAA AATACTGAAT TAGCAACTGT 2940

AAATGTTGCT AAGACTGGTA CTGCACATTT ATTTAAAGTA CCAACTGATG CTGATCGTTT 3000

AGATTTACAA TTTATTCCTG ACAATACAGC AGTTGCTGAT GCTTCAAGAA TTACAACAAA 3060

TAAAGATGGT TATAAATACT ATTCATTCAT TGATAATGTA GGTCTATTCT CAGGATCACA 3120
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TAATACTGAA ATCGGTAACA ATGGTAATTT 


TGGTGCTTCA 


TTAAAAGCAG 


ATCAATTTAA 


3240 




ATATGAAGTA ACATTACCAC 


AAGGTGTAAC 


TTACGTTAAT 


AATTCATTAA 


CTACAACATT 


3300 


5 


CCCTAATGGT 


AATGAArtAna 


/"•»!» 7\ r-» » /—»rr^ •» linn 


GAAAAATATG 


ACTGTTAATT 


ATGATCAAAA 


3360 




TGCAAATAAA 


GTTAPATTTA 


CAAGCCAAGG 


TGTGACAACG 


GCACGTGGTA 


CACACACTAA 


3420 




AGAAGTTTTA 

X^VJ^vrwj A A A A 


TTPPP A A t a 

A av-\- ' — Hon 1 A 


A TV r TV~ ,, l "I**!' TV TV TV 

AA1L 1 I 1 AAA 


ATTATCATAT 


AAAGTTAATG 


TTG CGAAT AT 


3480 


10 


CGATACACCT 


AAAAATATTP 


A i 'l'i'l'AATGA 


AAAATTAACA 


TATCGTACTG 


CTTCAGATGT 


3540 




THTAATTAAT 


A JiTrtrrna a /-» 

r\r\ 1 Vjv_tj( ttAL 


/"» n /-» n tv /TP /-» tv 
LAbAAb 1 3.CA 


CTAACTG CAG 


ATC CATTTTC 


AGTAGCGGTT 


3600 


15 






GCAACAACAA 


GTAAACTCAC 


AAGTTGATAA 


TAGTCATTAC 


3660 


APA ACAf^PAT 


<"* a RTHTT TV O 7\ 

wiAl l\»CACiA 


TV TV TA TV m * * k 

ATA CAAT AAA 


CTTAAACAAC 


AAG CAGATAC 


TATTTTAAAT 


3720 




taAAtjA 1 OV-tjA 


A 1 LATGTTAA 


AACTGCAAAT 


CGTGCATCTC 


AAGCGGATAT 


TGATGGTTTA 


37B0 


20 


tj 1 AA^ 1 AAA 1 


1 ACAAGCTGC 


ATTAATTGAT 


AATCAAGCAG 


CAATTG CTGA 


ATTAGATACT 


3840 






AAAAGOI IAl 


AGCAG CACAA 


CAAAGTAAAA 


AAGTTACGCA 


AGATGAAGTT 


3900 




VJ*_AVjt_At_ lib 


xaactaaaat 


TAACAATGAT 


AAAAATAATG 


CAATCGCAGA 


AATTAATAAA 


3960 


25 


r*A a ArTAr'TAf 


LALAAGGIGT 


CACAA CTGAA 


AAAGATAATG 


GTATCGCAGT 


GTTAGAACAA 


4020 




Vl/"\ A Vj 1 \Jrt A A /\ 




TAAACCTCAA 


GCGAAACAAG 


ATATTATCCA 


AG CAGTTACA 


4080 






AAL-AAA1 AAA 


TV TV TV f**Y*f** TV TV 1* rn 

AAAGTCAAAT 


GCATCATTAC 


AAGATGAAAA 


AGATGTAGCA 


4140 


30 




1 luu 1 AAAA 1 


TGAAACAAAG 


GCAATTAAAG 


ATATTGATGC 


AGCAACAACA 


4200 




AATfiPAPA An 


i AVjAALt^ L_A 1 


TAAAACAAAA 


GCAATCAATG 


ATATTAATCA 


AACTACACCT 


4260 






f"**TA A fiflPRrP 
1 fw\bL/ibL 


AG CT CTlGAA 


GAATTTGACG 


AAGTTGTTCA 


AGCACAAATT 


4320 


35 




V- 1 1 X AAA1 t-L- 


T*/^ TV TV TV TV /"» T\ 

TGATACAACA 


AATGAAGAAG 


TAGCGGAAgC 


TATTGAACGT 


4380 




ATT A-A TdC A H 


innnb 1 HL 


tggtgttaaa 


GCAATTGAAG 


CGACAACGAC 


TGCACAAGAT 


4440 


40 


TTAHA A AG An 
a l nvjnrtrt'jnw 


1 1 AAAAALAjA 


^PV TV TV TV / Mil TV 

AG AAAT CTCA 


AAAATTGAAA 


ATATTACTGA 


CTCTACGCAA 


4500 


A f* A A A A A 


TA TV"* /""V 1 » A *T» TV TV 

A a\sC LTATAA 


<Y^/^ TV TV i'MIMII * * « 

TGAAGI 1 AAA 


CAAGCTGCAA 


CAG CT AG AAA 


AGCTCAAAAT 


4560 




GCTACAGTTT CAAATGCAAC AAATGAAGAA GTAGCAGAAG CTGATGCAGC AGTAGATGCA 4620

GCTCAAAAGC AAGGTTTACA TGACATCCAA GTTGTTAAAT CAAAACAGGA AGTTGCTGAT 4680

ACAAAATCAA AAGTATTAGA TAAAATCAAT GCAATTCAAA CACAAGCAAA AGTTAAACCT 4740

GCAGCTGATA CGGAAGTAGA AAACGCATAT AATACACGTA AACAAGAAAT TCAAAATAGC 4800

AATGCTTCAA CTACAGAAGA AAAACAAGCT GCATATACAG AATTAGATAC TAAAAAGCAA 4860

GAAGCAAGAA CAAATCTTGA TGCTGCAAAT ACAAACAGTG ATGTAACAAC AGCTAAAGAC 4920
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GCGGAAATCG CTCAAAAAGC AAGTGAACGT AAAACAGCAA TTGAAGCAAT GAATGATTCG 5040 

ACTACTGAAG AACAACAAGC AGCGAAAGAC AAAGTGGATC AAGCAGTAGT TACTGCAAAC 5100

GCTGATATAG ATAATGCTGC AGCAAACAAT GATGTGGATA ATGCAAAAAC TACAAATGAA 5160

GCTACAATCG CAGCCATTAC ACCTGATGCA AATGTTAAAC CAGCAGCAAA ACAAGCAATT 5220

GCAGATAAAG TACAAGCTCA AGAAACAGCA ATTGATGGAA ATAACGGCTC AACAACTGAA 5280

GAAAAAGCAG CTGCTAAACA ACAAGTTCAA ACTGAAAAAA CAACAGCTGA TGCCGCAATA 5340

GATGCAGCAC ATACAAATGC GGAAGTTGAA GCGGCTAAAA AAGCAGCAAT TGCTAAAATT 5400

GAAGCGATTC AGCCAGCAAC AACAACTAAA GATAATGCGA AAGAAGCAAT TGCTACGAAA 5460

GCGAATGAAC GTAAAACAGC AATCGCTCAA ACGCAAGACA TTACTGCTGA AGAAATTGCA 5520

GCGGCTAATG CGGACGTAGA TAATGCTGTG ACACAAGCAA ATAGCAACAT TGAAGCTGCT 5580

AATAGTCAAA ATGATGTAGA CCAAGCGAAA ACGACAGGTG AAAATAGTAT TGATCAAGTA 5640

ACACCAACAG TTAATAAAAA AGCAACTGCA CGTAATGAAA TCACAGCAAT TTTAAATAAC 5700

AAATTGCAAG AGATTCAAGc tACGCCAGAT GCAACAGATG AAGAAAAACA AGCAGCTGAT 5760

GCTGAAGCAA ATACTGAAAA TGGTAAAGCA AATCAAGCCA TTTCAGCAGC AACTACTAAC 5820

GCACAAGTTG ATGAAGCTAA AGCAAATGCA GAAGCAGCGA TTAATGCGGT AACACCAAAA 5880

GTTGTGAAGA AACAAGCGGC TAAAGATGAA ATTGATCAAT TACAAGCAAC GCAAACAAAT 5940

GTTATCAATA ATGATCAGAA CGCTACAACA GAAGAAAAAG AAGCAGCTAT TCAACAATTA 6000

GCAACAGCAG TTACAGACGC GAAAAATAAT ATTACAGCTG CAACTGATGA TAATGGTGTA 6060

GATCAGGCGA AAGACGCTGG AAAGAATTCA ATTCAAAGCA CGCAACCAGC AACAGCGGTT 6120

AAATGAAATG CTAAAAATGA TGTTGATCAA GCTGTGACAA CTCAAAATCAT AGCAATTGAT 6180

AATAGAACTG GTGCTACAAC TGAAGAGAAA AATGCAGCAA AAGATTTAGT TTTAAAAGCT 6240

AAAGAAAAAG CGTATCAAGA TATCTTAAAT GCACAAACAA CTAATGATGT TACGCAAATT 6300

AAAGATCAAG CAGTTGCTGA TATTCAAGGT ATTACTGCAG ATACAACAAT TAAAGATGTT 6360

GCGAAAGATG AATTAGCAAC AAAAGCAAAC GAACAAAAAG CGCTTATTGC ACAAACTGCA 6420

GATGCGACTA CTGAAGAAAA AGAACAAGCA AATCAACAAG TAGACGCACA ATTAACACAA 6480

GGTAATCAAA ATATTGAAAA TGCACAGTCA ATCGATGATG TAAACACTGC AAAAGATAAT 6540

GCAATTCAAG CAATTGACCC AATTCAAGCA TCAACAGATG TTAAAACGAA TGCAAGAGCG 6600

GAATTGCTAA CTGAAATGCA AAATAAAATA ACTGAAATAC TTAATAATAA TGAGACTACT 6660

AATGAAGAAA AAGGTAACGA TATTGGACCA GTTAGAGCAG CATATGAAGA AGGTTTAAAT 6720
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w 






AAAGTTCAAC AACTTCATGC AAATCCTGTT AAGAAACCAG CAGGTAAAAA AGAATTAGAT 6840

CAAGCTGCAG CTGATAAGAA AACACAAATA GAACAAACAC CAAATGCATC ACAACAAGAA 6900

ATTAATGATG CAAAACAAGA AGTTGATACT GAATTAAATC AAGCGAAAAC AAATGTCGAT 6960

CAATCATCAA CAAATGAATA TGTTGATAAT GCAGTTAAAG AAGGAAAAGC TAAAATTAAT 7020

GCAGTTAAAA CATTTAGTGA GTACAAAAAA GATGCTTTAG CTAAAATTGA AGATGCATAT 7080

AATGCTAAAG TAAACGAAGC GGATAACTCT AACGCATCGA CTTCAAGTGA AATTGCTGAA 7140

GCGAAACAAA AACTTGCTGA ATTAAAACAA ACTGCGGATC AAAATGTTAA TCAAGCTACT 7200

TCTAAAGATG ACATTGAAGT TCAAATTCAT AATGACTTAG ATAATATTAA CGATTACACA 7260

ATTCCAACAG GTAAAAAAGA ATCAGCTACA ACAGATTTAT ATGCTTATGC AGATCAGAAG 7320

AAAAATAATA TTTCAGCTGA CACTAATGCA ACACAAGATG AAAAGCAACA AGCAATTAAG 7380

CAAGTTGACC AAAATGTTCA AACTGCATTA GAAAGCATTA ATAATGGTGT GGATAATGGT 7440

GACGTTGATG ATGCATTAAC ACAAGGTAAA GCAGCAATTG ATGCTATTCA AGTAGATGCT 7500

ACTGTTAAAC CTAAAGCGAA CCAAGCTATT GAAGTTAAAG CAGAAGATAC GAAAGAATCT 7560

ATTGATCAAA GTGACCAGTT AACTGCTGAA GAAAAAACTG AAGCATTAGC AATGATTAAA 7620

CAAATTACAG ATCAAGCTAA ACAAGGTATT ACTGATGCAA CAACAACTGC TGAAGTTGAA 7680

AAAGCGAAAg cTCaAGGACT TGAAGCATTT GATAACATTC AAATCGACTC AACAGAAAAA 7740

CAAAAAGCTA TCGAAGAATT AGAAACTGCA CTAGACCAGA TTGAAGCAGG TGTAAATGTC 7800

AACGCTGATG CTACAACTGA AGAAAAAGAA GCGTTTACGA ATGCTTTAGA AGACATTTTA 7860

TCAAAAGCAA CTGaAGATAT TTCTGATCAA ACTACAAATG CAGAAATCGC TACTGTCAAA 7920

AATAGTGCGC TTGAACAACT TAAAGCACAA CGTATTAATC CTGAAGTTAA GAAAAATGCT 7980

TTGGAAGCAA TCAGAGAAGT GGTTAACAAG CAAATAGGAA tAATTAAAAA TGCAGATGCA 8040

GATGCATCGG CGGAAAGAnA TTGCACGTAC GGGATTTAGG TAGATATTTT GGACCGATTT 8100

GCTGGATAAA TTTAGGGTnA AACCCCAACC AATGCCGAAG TTGCCTGAAT TACCA 8155 
(2) INFORMATION FOR SEQ ID NO: 64: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1630 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
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CTGTTTTATT TGCAGCACCC ATACTGGAAA TCACTTTAAT CCCTCGGTCA AGACACTCTT 120 

TCATTAAGTG TACTTTGTAC ATTATTGTAT CACTTGCATC TACAAAATAA TCTATATCGT 180

AGTTATCGAA AATTTCTTCA TATGTCTCTT CTGTATAAAA CATATGTAAG GGCGTGACTT 240

TACAATCTGG ATTAATTAAT TTAATACGTT CTTCCATCAA AGAAACTTTA CTTTGTCCTA 300

CCGTTGTAGT TAAAGCGTGT AATTGTCTGT TTACATTTGT AATATCAACA TCATCTTTAT 360

CTATTAATAT AATATGACCA ATATTCGTTC TTGCTAATGC TTCAGCAGCA AATGAACCAA 420

CACCTCCAAC GCCAAGTATG ACAACAGTTT GTTGCTTCAA TAAATCTAAA CCTTGTTGTC 480

CAATCGCTAG TTCATTTCTT GAAAATTGAT GTTTCATTAT TTTACCTCTT TCACTGATTT 540

ATACATAAGT ACATAGTAAC TTAAAATTTT ATATTTAGCA TTATCACTTT GATTATTTTC 600

CCAAAATTCA ACGAGGAAAC ATTTATTAAA CGCTATAAAA CCCAACTAAT TCTTTATTAA 660

AAACTTAAAG AAACGCATAA AAATACGCAA GACAAAGTCT TGCGTATCGA TAGAGTCCGT 720

ATTGCCGTAG TTATAATAGC TTGATCATTC GGCCTGTTAT ATACAGGTGG GTGCCCTGTT 780

TCTTGTTTTG TACGTCCTTC ATATAAGGCG TGTACGCTGC AAGAAAACCC ATTGGGCTCC 840

CTTGATCAAA GAGTGTTAGG CCCAAATTAA AAAGCAAACT TACGAACAAC TCAGATGACT 900

ATCTTATGAT GTTATATTAC CACATAATTA AAATTAATGA AATTATAACA AACCAAAGTT 960

TATTGATTTT TTAAAATTTA GTGACGAATT CGCAAAGAAA GTTCTTCTAA TTGTTTATCA 1020

GAAACTTCAC TAGGCGCATT CGTTAATAAA CATGTAGCAG ATGCTGTTTT AGGGAATGCG 1080

ATTGTATCTC TCAAGTTTGT TCTATTAGTC AATAACATGA CTAATCGGTC tAATCCTAAT 1140

GCAATACCGC CATGTGGTGG TGCACCATAT TTAAATGCAT CTAGTaAGAA GCCGAACTGT 1200

TCCTgTGCTT GTTCTTTAGT AAATCCAA 1260

TCATGAATTC TGATTGAACC GCCACCTAAT TCATAACCAT TTAATACTAT GTCATAAGCA 1320

TTTGCCTCAG CTTCtTCTGG CGCAGTGCCA AGCTTAGCAA TATCAGCTTC TTTTGGAGAT 1380

GTAAATGGAT GATGTGCTGC AACGTAACGT TTCGCATCTT CATCATATTC TAATAATGGC 1440

CAATCTGTCA CCCATAAGAA GTTTAATTTT GTTTCATCGA TTAAACCTAA TTCTTTAGCT 1500

AATTTGACAC GTAATGCACC TAAACTTTGT GCAACGACAT TTGGTttGTC TGCAACAAAC 1560 

ATTACTAAGT CACCAGCTTC AGCACCAGTT AATGTAAGTA ATGTTTCAAC ATTTTCTGTT 1620 

CAAAGAAACG 1630 
(2) INFORMATION FOR SEQ ID NO: 65: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 732 base pairs 
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15 



(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

CAATTGGACA TCTTGTATGA AAAGGACAAC CTTGCGGCGG ATTACTTGGC GAAGGTAATT 60

CTCCTTTTAA TATAATTCTA TTGTTATTAT GTTTATCAAT TTGTGGTATT GATGAAATCA 120

ACGCTTTTGT ATATGGATGT TTGGGATTTT CATAAATTTC TTTATCAGAT GCGATTTCAA 180

CTATATGACC TAAATACATA ACTCCAATGA CATCACTTAT ATGTTTTACT ACACTTAAAT 240

CATGTGCGAT AAATAAATAG CTTAAGTTAA ATTGTTCTTG TAAATCTTTT AATAAATTCA 300

GTACTTGAGA TTGAACAGAT ACATCTAATG CACTTACAGG CTCATCAGCA ACAATTAAAC 360

TCGGACGCAA AGCCAATGCT CTTGCAATTC CCACTCTTTG TCTCTGTCCA CCTGAAAATT 420

CATGTGCATA TTtATAATAT GCATCTTCAC TTAGGCCAAC ACATTTTAAT AAATATAGTA 480

CTTCTTTTTT TATTTCTTCT TTTGGCAATT TTTTATAATT TAAAATAGGT TCTGAAATGA 540

TATCTCCAAC CATTTGCATC GGATTCAATG ATGCATACGG ATCTTGAAAT ATCATCTGAT 600

ATTGTTGTCG TGATTTTCTG AGTTTTTTAC CTTGTAATCT TGTTATATCT TCACCATTAA 660

CAATTATTGA GCCTGAAGTT GCATCTTCAA GCCTGATAAT CACTTTACCT AACGTTGACT 720

TACCACAACC CG 732

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5838 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

AATATATTCA TATGTTTCAT CAACAATATT AGCTGCTTTT TGAATTAAAG CAATTTCGTC 60

AGCATCTTTG ACGTCTCTAA TTTTATCTAC AGTATTAGAA ATGCTTATTA ATGATATACG 120

GCTTTTATTT AATTCAAGGT ATGTATCATA ACTTACATGA TGCCCCTCAA AACCTACATT 180

TTCAAAATTT TCTTGGTGTA GCAATTCTTT AATCTCACCA ATAATAGTAG ATTTACGATT 240

AATAATTTCA TAATTTGGCG CCTGCTTAGT TGCTTGATCA ATATATCTAA AGTCTGTTAT 300

CAAATATTGT TTATCTTTAG ATATGATAAG TGCTCCACTG GTACCAGTAA AACCTGATAA 360

ATATCTTCTA TTGTAATCCG AAAGAATGaT AATCGCATCT AAATGTTTTT GTTCTAAAAT 420
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CAACTTTATA CATTAAAATA ATATCATAAT AAGGATAAAA AATAATAGAT ATTGATTTTA 540 
GGGAGATAGT AATGAAAAAA TTGGTTTCAA TTGTTGGCGC AACATTATTG TTAGCTGGAT 600 
GTGGATCACA AAATTTAGCA CCATTAGAAG AnAAAACAAC AGATTTAAGA GAAGATAATC 660 
ATCAACTCAA ACTAGATATT CAAGAACTTA ATCAACAAAT TAGTGATTCT AAATCTAAAA 720 
TTAAAGGGCT TGAAAAGGAT AAAGAAAACA GTAAAAAAAC TGCATCTAAT AATACGAAAA 780 
TTAAATTGAT GAATGTTACA TCAACATACT ACGACAAAGT TGCTAAAGCT TTGAAATCCT 840

ATAACGATAT TGAGAAAGAT GTAAGTAAAA ACAAAGGCGA TAAGAATGTT CAATCGAAAT 900

TAAATCAAAT TTCTAATGAT ATTCAAAGTG CTCACACTTC ATACAAAGAT GCTATCGATG 960

GTTTATCACT TAGTGATGAT GATAAAAAAA CGTCTAAAAA TATCGATAAA TTAAACTCTG 1020

ATTTGAATCA TGCATTTGAT GATATTAAAA ATGGCTATCA AAATAAAGAT AAAAAACAAC 1080

TTACAAAAGG ACAACAAGCG TTGTCAAAAT TAAACTTAAA TGCAAAATCA TGATAGGAGT 1140

CTTTTAATGC GTAATATAAT ATTTTATCTT GTACTTATTA TTGCTGCGAT TGGATTAGTA 1200

ATGAATCTAG ATGCCTTTAT TTTTTCAATC GTCAGAATGT TAATCAGCTT TGcgTAaTAG 1260

CTGGTATTAT TTATCTGATT TATTATTTCT TCATCTTAAC TGAAGACCAA CGCAAATATC 1320

GCAAAGCAAT GCgTrAaGTA TAAAAGAAAT CAAAGAAGAA AATAGATAAA AAAACGGAAG 1380

CACTTGTAGG TAAAATAGTC TACGTGCTTC CATTTTTTAT TCTAAAAACT ACTTTCTAAA 1440

CATCCATTCA TCTGAACGAT ATTTTTCAGT TAATTCTTCC ACTTCTGCCA ATTGAGCTTC 1500

TGtTAATTCA AGTGGCTTTA ATTCTATATT TAAACCTTTC TTAAAACCTT TCTCGAAAGC 1560

TTCTTCCATT TGACTAATAG TAATGTGTTC ATCTGAAATA TCATTGATGG CAACTGCTTT 1620

TTCAACGAAT GCCTCTTTCA TTTTTAATTT TAATCTTTCA TTTTTATAAA TrAACATATC 1680

AAACAGTTCA TCAATATCAA TATCTTGTAA AATCGAACCG TGTTGGAGGA TTACGCCCTT 1740

TTGTCTCGTT TGAGCACTCC CAGCAATCTT ACGGCCTTCA ACAACTAGCT CATACCAACT 1800

TGGTGCATCA AAACACACTG AACTTCGAGG TTGTTTTAAT TTTTGACGCT CTTCAGGCGT 1860
TTTAGGTACC GCAAAATAAG TATCAAATCC TAAGTTTTTA AATCCTTCTA ATAATCCTTG 1920
TGAAATCACT CTGTACGCTT CTGTAACTGT AGAAGGCATA TTCGGATGCG ATTCAGGCAC 1980
AATCACACTG TAAGTTAACT CTTTATCATG TAGCACCCCA CGGCCACCAG TTTGACGCCT 2040
TACGAGACCA AAACCTTTCT CTTTAACCTT ATCAATATCA ATTTCTTTTT GTAGCCTTTG 2100
GAAATACCCT ATTGATAATG TTGCAGGATT CCATGTGTAA AAACGTATAA CTGGATCAAT 2160
TTCACCTCTA GAGACAAAAT TTAATAACGC TTCATCCATT GCCATATTAT AATATGGGTC 2220
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AAATGTATAA TATTTGATTC GCTAATTAAT CAATTTAACT AAATGAATAA TAATTGCAAT 2340 

TCTTTAGTGA AATATTTTGA TAATTTGACC TAACAGTCTT ATAATTATAT TATCGTTTAA 2400

TTAGGGAGGA TGCAAGATGA GTGCTAGTTT GTACATCGCA ATAATTTTAG TTATAGCAAT 2460

TATTGCTTAT ATGATTGTTC AACAAATTCT TAACAAGCGA GCTGTTAAAG AATTAGATCA 2520

AAATGAATTC CATAATGGGA TTAGAAAAGC TCAAGTCATC GATGTTAGAG AGAAAGTTGA 2580

CTATGACTAC GGTCACATTA ATGGGTCTCG CAATATTCCT ATGACAATGT TCAGGCAACG 2640

ATTCCAAGGA TTAAGAAAAG ATCAACCGGT ATACTTATGT GATGCCAATG GGATTGCTAG 2700

CTATAGAGCC GCTCGTATTT TGAAAAAGAA TGGATATACA GATATCTATA TGTTAAAAGG 2760

CGGCTATAAA AAATGGACTG GAAAAATAAA GTCTAAAAAA TAGTTTTTGT AAATTTAATA 2820

TACGATTTAA TAAAATCTGA GTGTTAATTG ATCATCAATA ACAATACTCA GATTTTAATT 2880

TTTTAACAAA GTCTGTTACT ATATTTCTCT AGCTTCACTG ATCATTAAAC TTAGTTTCAG 2940

CATAATAAAG AAAGTTCAGC TCATTTTCAA TACGATTCAA TTACCGCAAT CTAAAAAATG 3000

AAAAGACAAT TTCTATGAAA GAATAATACC AAACCCTAAG AGTTATTACT TCGGTTTAGT 3060

TTTCTTGTTT AAATAGAAAT TGTCTTTTTC AATTGATTTT GAAACCATTA TCCTTAAATC 3120

TTCATACAAA GTTAGAATAA TAATTCTCGG AATATGTGTT TAATACTTTA TTTTTCCTGT 3180

TTAAGATTTT CAAACTTTAA TATTGGTTTA CGAGCAGCTG TAGCTTCGTC TAATCGATCA 3240

ATCACAGTTG TATGTGGTGC TTCTAGCacT TTATCAGGAT CATTTTTAGC TTCTTCAGCA 3300

ATACTAATTA ATGTATCGAT AAAATAATCA AGTGTTTCTT TAGACTCTGT CTCAGTCGGT 3360

TCAATCATCA TACCTTCTTC AACATTTAAT GGGAAGTATA TTGTTGGTGG ATGTACACCG 3420

AAATCTAATA ATCGCTTAGC CATGTCTAAA GTACGTACAC CAAATTCTTT TTGACGCACA 3480

CCACTTAACA CAAACTCGTG TTTACAATAT TGTTTATAAG GTATTTCAAA GTGTTTAGAT 3540

AAACGTGCTT TAATATAATT CGCATTAAGA ACCGCTGCTT CAGAAACCTC TTTAAGTCCA 3600

GTTGCTCCCA TAGTTCGAAT ATACGTATAA GCTCTTAAGT AAATACCAAA GTTACCATAA 3660

AATGGTTTTA CACGTCCGAT AGAATTTTTA ATGTCATTAT CATATTTAAA TTTGTCGCCA 3720

TCTTTAATAA CCATTGGCTT TGGTAAGTAA CTTGCTAGTT CTTTTACTAC ACCGACTGGA 3780

CCTGAACCAG GACCGCCACC ACCATGTGGA CCAGTAAATG TTTTATGCAA GTTTAAATGA 3840

ACAGCATCAA ATCCCATATC TCCTGGGCGA ACTTTGTCCA TAATAGCGTT TAAATTCGCA 3900

CCATCATAAT ATAATAGACC ACCAGCATTA TGGACGATTT CACGGATTTC CATAATATTT 3960

TTTTCGAAAA TACCTAAAGT GTTTGGATTA GTTAACATAA TAGCTGCTGT ATTTTCATTT 4020
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GATTTAAATC CTGCAAATGa AGCTGAGGCT GGaTTCGTAC CATGCGCAGA ATCTGGcACA 414 0 

ATGACTTCAT CACGATGACC TTCACCATTA TTCTCATGGT AAGCTTTAAA TATCATCAAT 4200

GCAGTCCATT CACCATGTGC GCCAGCAGCT GGTTGTAATG TCACCTCATC CATACCAGTA 4260

ATTTCTTTTA ATTCTTCTTG CAAACTATAA ATAATTTCTA ATGAACCTTG AACTTGATCT 4320

TCATCTTGTA ATGGATGTGA TTCACTAAAT CCTGGTATTC TAGCAACCTT TTCATTAATT 4380

TTAGGGTTAT ACTTCATCGT ACATGAACCC AATGGATAAA ATCCGTTGTC TACACCGAAA 4440

TTTTTATTTG AAAGTTCAGT ATAATGACGT ACTAAGTCTA GTTCAGCAAC TTCAGGAAAC 4500

TCCGCTTTGT TTTTACGAAT AAATTTATCA TCTAACAATG ACTCAACAGA ATTTGTTTTA 4560

ATATCACTTT TTGGTAATGA ATATGCATAT CTGCCTTCAC GAGATCTTTC AAAAATTAAT 4620

GGACTTGATT TACTAGTCAT TTAACTCACC AGCCTTTTCT ACAAATGTAT CGATTTCATC 4680

TTTTGTTCTT AATTCAGTTA CAGCTATTAA CATGTGATTT TTAAAGTCGT CTGAAACAAC 4740

ACCTAAATCA AAACCACCGA TAATATTGTA CTTCACTAAT TCCTCGTTAA CTTGTTGAAT 4800

TGGTTTGTCA AATTTGACTA CAAACTCATT GmnAAGnTGT ACCATCTAAT ACTTCAAAAC 4860

CTTTTTTAAT AAATTGTTGT TTAGCATAGT TAGCATGTTC TATATTTTGA ACTGCAATAT 4920

CATAGATACC TTGTTTACCA AGTGCTGACA TTGCAATTGA TGaCGcTAAA GCATTTAATG 4980

CTTGGTTAGA ACAAATATTA GATGTCGCTT TATCGCGTCG AATATGTTGT TCACGTGCTT 5040

GTAATGTTAA TACAAAGCCA CGATTACCTT CATCATCTTG TGTTTGACCG ACTAATCTAC 5100

CTGGCACTTT ACGCATTAAC TTTTTCGTCG TTGCAAAATA TCCACAATGT GGCCCACCGA 5160

ATTGAGCAGG AATTCCGAAT GGCTGAGTAT CACCTACAAC AATATCTGCA CCAAATGAAC 5220

CTGGAGGTGT AAGTAATCCC AATGCTAATG GATTTGCATA TACGATAAAT AATGCTTTTT 5280

TATCFTCAAT AAAGCTATGA ATCTTTTCAA GATCTTCAAT TGAACCGTAA AAGTTTGGAT 5340

ATTGTACTGC AACAGCTGCT GTTTCATCAT CCACTGCTGC TTCTAATTTT TTCAAATCTG 5400

TAACAGTGCC ATCTAAATCG ATTTCCACTA CTTCGAATTC CTTACGCGTC TTAGCATAAG 5460

TATGAAGTAC TTGTAATGCT TGATAATGTA AACCTTTTGA GACTACAATT TTATTTTTCT 5520

TTGTTTGACT AAATGCTAAG ATACATGCTT CAGCAAAGCT AGTCATCCCA TCATACATAG 5580

AAGAATTTGC TACATCCATA TCTGTTAATT CACAAATTAA AGTTTGGAAC TCAAAAATGG 5640

CTTGTAATTC ACCTTGAGAA ATTTCCGGTT GATATGGCGT ATATGCTGTG TAAAATTCTG 5700

ATCTTGAAAT CATAGCATCC ACAACTGATG GCGCGTAATG ATCATAAACA CCAGCACCCA 5760

rAAATGATGT ATGCGTTTCT TTAGTGATAT tCTTGCTkGC AATGGGGATT TAAACnTCTA 5820 
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(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:








ATnATAATTG 




TAATTACTTC 


CCTGAATTAC 


aAGTATT AG C 


AAACGAAATA 


60 


15 


AAATCTGATA 




ATTAAAACAA 


TGATATTTTT 


ATTTAAATTT 


TTaAAGCTTT 


120 




GTACGAAATT 


<j 1ALAAAGCT 


TT J jrxi"GGTGC 


GTATTGTATG 


GG CAACAACT 


TGACGATGAA 


180 




AATCCGTTAC 


ft f~*f^ TV *'l " 1 < /**'/* N "T>7\ 


ATAGGAAATG 


TTAGCGAAAG 


ACAAGGGTAT 


CCATTGTAGA 


240 


20 


TTAACAAAAG 


111 t-CA 


CAAGTGTGGG 


TTATTCTCAC 


TAAAGCAATA 


CGCAGAGACA 


300 




ACTTACGTAA 


AA III rtiAAC 


TGACTAGAAC 


GGAACTTCTA 


CTCAATTATT 


GATAAAAATT 


360 




TTCAAAAAGA 


tl 1 IjAA. 1 CjTG 


CTGAGAATAC 


GAAGTTTATG 


GAAGGATTAT 


CAAAATATAA 


420 




ATGTGCATTC 


nil 1ALAALL 


TTTATTGACA 


ATGATTCTCA 


ACTAATATAG 


TATATAATCA 


4B0 




AATCGTAATA 


ul 1 MLbA 111 


GTTTTCTGCA 


ACTTTTTTGA 


AGTTTTAGTT 


GAGGTGAAAA 


54 0 


30 


CAATAAAAGC 


L li\M.o 1 L»A 


ATGTAGTTAA 


CGGACAACTG 


CATTCGCTTG 


TAGAGCCACA 


600 


AG AAG CAACT 




TTTACGGTTG 


CATTTTGATA 


CAACAAC CG A 


TTACTAAGTC 


660 




ATGCTTTCCA 




TAGCATGACT 


TACCTAATAG 


ATAGAGCTAT 


TAGGTTCAGC 


720 


35 


TTCTAAAAAA 


1 1 ALA\j 1111 


AGAGGAATAC 


AGTTGcTTGc 


tTCGCAACAA 


CTGCATAAGA 


780 




GCCATGGTTT TCGCTTTTGC GAATTAGCAT GACTTACCTA CTAGATAGAG CTATTAGGTT 840

CATCTTCTAA AAAATTACAG GTTTAGAGGA ATACAGTTGT TTGcTTCGCA ACAACTGCAT 900

AAGAGCCTCT AGTAATTAAA ATTACAGAGG CTCTAAAAAT ACATCTAAAG GAGTGTCGTA 960

TGAATCGGCA GGTTATAGAA TTTTCTAAGT ATAATCCTTC GGGGAATATG ACGATACTTG 1020

TTCATTCAAA ACATGATGCT AGTGAATATG CATCTATCGC CAATCAGTTG ATGGCCGCAA 1080

CACATGTATG CTGTGAACAG GTAGGCTTTA TAGrATCAAC ACAAAATGAT GATGGTAATG 1140

ATTTTCACTT AGTTATGAGC GGTAATGAAT TTTGCGGTAA TGCGACGATG TCATATATAC 1200

ATCATTTGCA GGAAAGTCAT TTGCTTAAAG ACCAACAGTT TAAGGTGAAG GTGTCTGGCT 1260

GTTCGGATTT AGTGCAATGC GCAATTCATG ATTGCCAATA CTATGAAGTT CAAATGCCAC 1320

AAGCCCATCG TGTTGTGCCA ACAACAATTA ATATGGGTAA TCATTCATGG AAAGCAATAG 1380
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TTCAACATTT GGTTGAAGCG TTTGTGCGTG AgcAACAATG GAGTCACAAA TATAAAACAG 1500 

TAGGTATGAT GCTTTTTGAT GAACAACGTC AATTTTTACA GCCATTAATC TATATACCAG 1560 

AAATTCAAAG TTTAATTTGG GAAAATAGCT GTGGTTCTGG TACAgCATCA ATTGGGGTTT 1620

TTAATAATTA TCAACGTAAT GACGCATGCA AAGATTTTAC AGTACATCAG CCAGGGGGCA 1680

GTATTTTAGT GACATCAAAG CGATGTCATC AATTGGGATA TCAAACTTCA ATTAAAGGAC 1740

AGGTTACAAC TGTAGCTACA GGaAAAGCAT ATATAGAATA AGGAGCCTAC AATGAATAAC 1800

TTTAATAATG AAATCAAATT GATATTACAA CAATATTTAG AAAAGTTTGA AGCGCATTAC 1860

GAGCGTGTAT TACAAGACGA TCAATATATC GAAGCATTAG AAACATTGAT GGATGACTAT 1920

AGTGAATTTA TTTTAAATCC TATTTATGAA CAACAATTTA ATGCTTGGCG TGACGTTGAA 1980

GAAAAAGCAC AATTaATAAA ATCACTGCAA TATATTACAG CGCAGTGTGT TAAACAAGTG 2040

GAAGTCATTA GAGCGAGACG TCTATTAGAC GGACAGGCGT CTACCACAGG TTACTTTGAC 2100

AATATAGAAC ATTGTATTGA TGAAGAGTTT GGACAATGTA GTATAGCTAG CAATGACAAA 2160

TTATTGTTAG TTGGTTCAGG TGCATATCCA ATGACGTTAA TTCAAGTAGC AAAAGAAACA 2220

GGTGCTTCAG TTATCGGTAT TGATATTGAT CCACAAGCCG TTGACCTAGG GCGCAGAATC 2280

GTTAACGTCT TAGCACCAAA TGAAGATATA ACAATTACGG ATCAAAAGGT ATCTGAACTT 2340

AAAGATATCA AAGATGTGAC GCATATCATA TTCAGCTCGA CAATTCCTTT AAAGTACAGC 2400

ATTTTAGAAG AATTATATGA TTTAACAAAT GAAAATGTCG TAGTTGCAAT GCGCTTTGGT 2460

GATGGCATCA AAGCAATATT TAATTATCCG TCACAAGAAA CAGCGGAAGA TAAGTGGCAA 2520

TGTGTGAATA AACATATGAG ACCACAGCAA ATTTTTGATA TAGCACTTTA TAAAAAAGCA 2580

GCTATAAAGG TAGGTATTAC GGATGTCTAA ATTATTAATG ATAGGCACTG GTCCgGTCGC 2640

AATGCAATTA GCGAATATTT GCTATTTAAA ATCAGATTAT GAGATTGATA TGGTTGGACG 2700

TGCCTCAACA TCAGAAAAAT CAAAACGCTT ATATCAAGCG TATAAAAAAG AGAAACAATT 2760
TGAAGTCAAA ATACAAAACG AGGCGCATCA ACATCTGGAA GGTAAGTTTG AAATTAATCG 2820
TTTGTATAAA GATGTTAAAA ACGTTAAGGG TGAATACGAA ACGGTTGTCA TGGCATGCAC 2880
AGCAGATGCT TATTATGACA CACTACAGCA ATTGTCGTTA GAAACTTTGC AAAGTGTCAA 2940
ACATGTCATT TTAATATCAC CGACATTTGG TTCGCAAATG ATTGTCGAAC AATTTATGTC 3000

TAAATTTAAT AAAGATATCG AAGTGATTTC ATTCTCAACT TATCTTGGCG ATACACGTAT 3060
TGTTGATAAA GAAGCGCCTA ATCATGTGTT GACAACAGGT GTAAAAAAGA AATTGTACAT 3120

GGGATCGACA CATTCAAACT CAACAATGTG TCAACGAATC TCTGCTTTAG CTGAGCAATT 3180 
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TTATGTGCAC 


: CCACCACTAT 


TTATGAATGA 


CTTTTCATTG 


AAAGCCATTT 


TCGAAGGAAC 


3300 




AGATGTACCG GTTTATGTGT ATAAGTTATT TCCTGAAGGA CCGATAACGA TGACACTAAT 3360

CCGTGAAATG CGTTTAATGT GGAAGGAAAT GATGGTTATT TTACAAGCAT TTAGAGTGCC 3420

GTCAGTCAAC CTGCTTCAAT TTATGGTGAA GGAAAATTAT CCAGTACGTC CTGAAACTTT 3480

GGATGAAGGT GATATTGAGC ATTTCGAAAT CTTGCCAGAT ATCTTACAAG AATATCTGCT 3540

TTATGTAAGA TATACCGCAA TCCTCATTGA TCCATTTTCA CAGCCAGACG AAAACGGACA 3600

TTACTTTGAT TTTTCAGCTG TACCATTTAA GCAAGTCTAT AAAAATGAAC AGGATGTTGT 3660

TCAAATTCCA AGAATGCCAA GTGAAGATTA TTACAGAACG GCGATGATTC AGCATATTGG 3720

GAAAATGCTA GGTATCAAAA CGCCAATGAT TGATCAGTTC CTAACTCGCT ATGAAGCAAG 3780

TTGCCAGGCG TACAAGGATA TGCATCAAGA TCAACACTTA TCTTCTCAAT TTAATACAAA 3840

TCTATTTGAA GGAGATAAAG CACTCGTCAC AAAATTTTTG GAAATCAATA GAACGCTTTC 3900

ATAATAAGGG TTTGAAGTTT TATAATAGAA AAAAATTATT GAATTATGTT TGACATTTAC 3960

ATAAAAATAA GCAAATAATT GAGAAAAATA ATCATTACGA TTTGATTAAG TAATGCAACT 4020

TATCAATTTA GAAAGAGGAA AAGCAAATGA GAAAACTAAC TAAAATGAGT GCAATGTTAC 4080

TTGCATCAGG GCTAATTTTA ACTGGTTGTG GCGGTAATAA AGGTTTAGAG GAGAAAAAAG 4140

AAAACAAGCA ATTAACGTAT ACGACGGTTA AAGATATCGG TGATATGAAT CCGCATGTTT 4200

ACGGTGGATC AATGTCTGCT GAAAGTATGA TATACGAGCC GCTTGTACGT AACACGAAAG 4260

ATGGTATTAA GCCTTTACTA GCTAAAAAGT GGGATGTGTC TGAAGATGGG AAGACATACA 4320

CGTTCCATTT GAGAGATGAC GTTAAATTCC ATGATGGTAC GCCATTTGca TGctGACGCA 4380

GTTAAGAAAA ATATTGACGC AgTTCAAGAA AACAAAAAAT TGCATTCTTG GTTAAAGATT 4440

tcgAcattaa TTGACAATGT TAAAGTTAAA GATAAGTACA CGGTTGAATT GAATTTGAAA 4500

GAAGCATATC AACCTGCATT GGCTGAATTA GCGATGCCTC GTCCATATGT ATTTGTGTCT 4560

CCAAAAGACT TTaAAAACGG TACAAcAAAA GATGGCGTTA AAAAGTTCGA TGGTACTGGT 4620

CCATTTAAAT TAGGTGAACA CAAAAAAGAT GAGTCTGCAG ACTTTAACAA AAATGATCAA 4680

TACTGGGGCG AAAAGTCTAA ACTTAACAAA GTACAAGCAA AAGTAATGCC TGCTGGTGAA 4740

ACAGCATTCC TATCAATGAA AAAAGGTGAA ACGAACTTTG CCTTCACAGA TGATAGAGGT 4800

ACAGATAGCT TAGACAAAGA CTCTTTAAAA CAATTGAAAG ATACAGGTGA CTATCAAGTT 4860

AAGCGTAGTC AACCTATGAA TACGAAAATG TTAGTTGTCA ATTCTGGTAA AAAAGATAAC 4920

GCTGTGAGTG ACAAAACAGT CAGACAAGCG ATTGGTCATA TGGTAAACAG AGATAAAATT 4980
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ACAGACATTA 


ATTTCGATAT 


GCCAACACGT 


AAGTATGACC 


TTAAAAAAGC 


ft ft TV •"Pf ft TTft 

ACxAAlv_Al 1A 


blOO 




TTAGATGAAG 


CTGGTTGGAA 


GAAAGGTAAA 


GACAGCGATG 


ff*t n/v^m/^ TV TV T\ TV 

TTCGTCAAAA 


tv f~* tv •m/"2 r P ft. ft. 71 


ri r*v 

b lb D 


5 


AACCTTGAAA 


TGGCAATGTA 


CTATGACAAA 


GG TTCTTCAA 


GTCAAAAAGA 


TV i^TV ft /"*/"» ft f"* 11 TV 

A<-AA1jLIAvjA*\. 


coon 




TACTTACAAG 


CAGAATTTAA 


GAAAATGGGT 


ATTAAGTTAA 


ACATCAATGG 


O/ - * TV TV TV /"'Tv TP ft 

LvjAAAw\ 1 V_A 


b^oU 




GATAAAATTG 


CTGAACGTCG 


TACTTCTGGT 


GATTATGACT 


T*TA TV 'I*/'" f«l>/™»K TV 

TAATG 1 i CAA 


/^/~» TV Tv ft PTTY 1 /^ 


b J4 U 


10 


GGATTATTGT 


ACGATCCACA 


AAGTACTATT 


GCAGCATTTA 


JV TV ^ TV /^» IV » TV TV TV 

AAGACaAAAAA 


^ TY~ , /~• r ^ ,r T , I^*T^ , a a 

ITjiVj 1 1 A I L»A/\ 






AGTGCAACAT 


CAGGCATTGA 


GAACAAAGAT 


AAAATATACA 


TV TV /*'/"' 7\ 'l*M A 

ACACjCAI ICjA 


1*jAv_IjCA1 1 1 




15 


AAAATCCAAA 


ACGGTAAAGA 


GCGTTCAGAC 


GCTTATAAAA 


ACATTTTGAA 


t\ tv Tv tv T* r P*™' TA T 
AC-AAAl ILjAI 


c c o n 


GATGAAGGTA 


TCTTTATCCC 


T ATTT CACAC 


GGTAGTATGA 


CAGTTGTTGC 


TV s TV TV TV TV T 1 


bbo U 




TTAGAAAAAG 


TATCATTCAC 


ACAATCACAG 


TATGAATTAC 


CATTCAATGA 


AATGCAGTAT 


C £ A f\ 

564 0 


20 


AAATAAAGGA 


GCAATTAGAT 


GTTCAAATTT 


ATCTTAAAAC 


GTATTGCGCT 


CATGTTT C CA 


C 1 f\ f\ 

b 700 




TTGATGATTG 


TAGTAAGTTT 


TATGACATTT 


CTATTGACGT 


ATATTACAAA 


fT</^ TV TV TV TV /"■'/'*• TV 

TGAAAATC CA 


e o ^ a 




GCTGTGACAA 


TTTTACATGC 


ACAAGGGACG 


CCAAATGTAA 


CACCAGAGTT 


«V imTVTip1\ fv TV 

GATTGCAGAA 


C O O A 

5320 


25 


ACGAATGAGA AGTACGGTTT 


CAATGATCCA 


TTATTAATTC 


«t *«ft m ft ft ft ft ft 

AATATAAAAA 


TTGGTTACTT 


588 0 




GAAGCGATGC 


AATTTAATTT 


TGGTACAAGC 


TACATTACAG 


^^m^* % ^^^^^^ ft ^^^n 

GTG AC CCAGT 


TGCTGAACGT 


5 94 0 




ATTGGTCCAG 


CATTTATGAA 


TACATTGAAA 


^ t\ t~+ TV TV »"p TV TV 

1 1 AACAATAA 


ill 1 Vj 1 


1 /\ 1 \ jVj 1 Vjri 1 vj 


D W U \J 




ATTACATCAA 


TTATTTTAGG 


TGTAGTTAGT 


GCATTAAAAA 


/•« m /-» TV TV TV /^n*r' 

GAGGAAAGTT 


/~i TV ^TV* TV T , /^/ r ^T l 

CACTGATCGT 


C C\C A 

bUbU 




GCGATACGTT 


CAGTGGCTTT 


CTTTCTAACT 


GCATTACCAT 


/■^ fv TV TV T 

CATATTGGAi 


AIjv- I I UAA I A 






CTTATTATTT 


ACGTTTCAGT 


GAAGTTAAAC 


ATATTG CCG A 


CTT CTGGATT 


ft a Ary^TPP a 
AA^JAiiu 1 V-V-A 


(Ti on 

D X O \J 


35 


GAAAGTTACA 


TATTGCCAGT 


GATCGTTATT 


ACGATTGCCT 


ATGCTGGTAT 


TTACTTTAGA 






AATCTTAGAC 


GCTCGATGGT 


GGAACAATTA 


AATGAAGATT 


ATGTACTTTA 


rpn^n* 1 tv tv ^ Tv r'pTi 
11 1 AALjAvj LA 


6300




AGCGGTGTGA 


AATCTATCAC 


ATTAATGTTG 


CATGTGTTGC 


GTAATGCTTT 


tv tv tv c^TT^r* r^c* 




GTATCAATCT 


TTTGTATGTC 


TATACCAATG 


ATAATGGGTG 


GACTAGTTGT 


*l , TA*r , /"V"»TV/" ,, *T , TA 

TAT UVjAvjTA 1 


6420




ATCTTTGCAT 


GGCCTGGACT 


AGGTCAATTA 


AGTTTAAAAG 


CAATACTTGA


AL-AV-oA 1 1 1 1 


6480




CCAGTCATTC 


AAGCATATGT 


ATTAATTGTA 


GCGGTATTAT 


TTATTGTATT 


TAATACATTA


6540




GCAGATATCA 


TTAATGCGCT 


ATTAAATCCA 


AGATTAAGGG 


aGGGCGCACG 


ATGATAATTT 


6600 




TAAAmCGATT 


ATTmCArGwT 


AAAGGTGCAG 


TAATTGCTTT 


AGGCATTATT 


GTATTATATG 


6660 




TCTTTTTAGG 


ATTAGCAGCA 


CCACTTGTGA CATTTTATGA


TCCTAACCAT


ATCGATACAG


6720 




CAAACAAATT 


TGCTGGCATG 


AGTTTTCAAC


ATCTACTAGG TACTGACCAT


TTAGGTAGAG


6780 
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TATTTGTTTC TGTACTTATT GGATCTATTT TAGGATTCTT ATCAGGATAT TTCCAAGGGT 
TTGTTGACGC CTTAATCATG CGTGCGTGTG ATGTTATGTT GGCATTCCCA AGTTATGTTG 
TAACGTTAGC ATTAATTGCA TTGTTTGGAA TGGGTGCCGA AAATATTATC ATGGCATTTA 
TTTTGACGCG TTGGGCATGG TTCTGTCGTG TTATACGTAC AAGTGTTATG CAGTACACTG
CTTCTGACCA TGTAAGATTT GCTAAAACAA TCGGTATGAA TGATATGAAA ATTATTCACA 


AACATATTAT GCCATTAACA TTAGCAGATA TTGCTATCAT CTCTAGTAGC TCGATGTGTT
CAATGATCTT GCAAATATCT GGCTTTTCAT TTTTAGGATT AGGTGTCAAA GCGCCTACTG 
CAGAGTGGGG CATGATGCTT AACGAaGCTA GAAAAGTGAT GTTTACACAT CCTGAAATGA

TGTTTGCGCC AGGTATTGCC ATAGTGATTA TAGTGATGGC ATTTAACTTC TTATCCGATG 
CTTTACAAAT TGCTATTGAT CCCCGCATCT CTTCTAAAGA TAAACTTCGT TCTGTGAAAA 
AAGGAGTGGT GCAATCATGA CATTGTTAAC AGTTAAACAT TTGACGATTA CAGATACCTG

GACAGATCAA CCACTCGTGA GTGATGTGAA TTTTACATTA ACTAAGGGTG AAaCTTTAGG 
CGTTATTGGA GAAAGTGGTA GTGGTAAATC AATCACTTGT AAATCGATTA TTGGTTTGAA 
TCCCGAACGA CTCGGGGTGA CAGGTGAAAT TATCTTTGAT GGTACAtCAA TGTTGTCATT

ATCTGAATCG CAATTGAAAA AGTACCGTGG TAAAGACATT GCGATGGTCA TGCAACAAGG 
TAGTCGTGCC TTTGACCCAT CAACTACTGT CGGTAAACAA ATGTTTGAGA CTATGAAAGT 
ACATACGTCA ATGTCTACAC AAGAAATTGA AAAGACATTG ATTGAATATA TGGATTATTT 
AAGTTTGAAA GATCCTAAAC GTATATTAAA ATCATACCCT TACATGTTAT CAGGAGGAAT 
GTTACAGCGA TTGATGATTG CTTTAGCGTT AgcTTTgAAA CCAAAGTTAA TCATTGCTGA
TGAGCCGACA ACGGCTTTAG ATACAATTAC ACAATATGAT GTACTGGAAG CATTTATAGA 
TATTAAAAAA CACTTTGACT GTGCGATGAT TTTCATTTCA CATGATTTAA CGGTTATTAA
CAAGATTGCA GACCGTGTTG TTGTGATGAA AAATGGTCAG CTTATTGAAC AAGGGACACG

TGAATCAGTC TTGCATCATC CAGAACATGT TTATACGArt ATTkTATTAT CAACGAAGAA 
GAAGATTAAT GATCATTTTA AACATGTGAT GAGGGGTGAT GTACATGATT AAAATTAAAG 
ATGTTGAAAA GTCATATCAA AGCGCACATG TTTTTAAGCG TCGTCGAACA CCTATCGTGA

AAGGTGTGTC ATTTGAGTGT CCAATCGGTG CGACGATTGC GATTATCGGA GAAAGTGGTA 
GCGGTAAATC GACGTTGAGT CktATGATAT TAGGTATTGA GAAACCGGAT AAAGGTTGTG
TAACCTTAAA TGATCAACCG ATGCATAAGA AGAAAGTGAG ACGTCATCAA ATTGGTGCTG

TATTTCAAGA TTATACGTCA TCATTACATC CATTTCAGAC TGTTAGAGAA ATCTTATTTG 
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TGTTGGAAGA 


AGTCGGTCTA 


TCTAAGGCAT 


ACATGGATAA 


ATATCCTAAT 


ATGTTATCAG 


8700 




GTGGAGAGGC 


GCAACGTGTT 


GCGATTGCGC 


GTGCAATATG 


TATTAACCCT 


AAATATATTT 


8760 




TGTTTGATGA 


AGCCATTAGT 


TCACTCGACA 


TGTCAATTCA 


AACACAAATA 


TTAGATTTAT 


8820 




TGATTCATTT 


ACGTGAAACG 


CGTCAGTTGA 


GTTATATTTT 


TATCACACAT 


GATATTCAAG 


8880 




CTGCCACGTA 


TTTATGTGAT 


CAATTAATTA 


tttttaaaaa 


CGGAAAAATA 


GAAGAACAAA 


8940 




TTCCGACAAG 


CGCATTGCAT 


AAAAGTGACA 


ATGCTTATAC 


AAGAGAATTA 


ATAGAAAAAC 


9000 




AACTATCATT 


CTAAGGAGTG 


AGATAATGAA 


AGGTGCAATG 


GCTTGGCCCT 


TTTTGAGATT 


9060 




ATATATATTA 


ACATTGATGT 


TCTTTAGTGC 


CAATGCAATC 


TTAAACGTGT 


TTATACCTTT 


9120 


ACGAGGGCAT 


GATTTAGGCG 


CAACGAATAC 


GGTTATCGGT 


ATCGTTATGG 


GGGCATACAT 


9180 




GTTAACAGCA 


ATGGTATTTC 


GACCATGGGC 


AGGACAAATT 


ATTGCTCGTG 


TCGGTCCCAT 


9240 




TAAAGTATTA 


AGAATTATTT 


TGATTATCAA 


TGCCATAGCT 


TTAATTATTT 


ATGGTTTTAC 


9300 




TGGCTTAGAA 


GGTTATTTCG 


TAGCACGTGT


TATGCAAGGT 


GTGTGTACGG 


CATTCTTTTC 


9360 




TATGTCTTTA 


CAGCTAGGTA 


TTATTGATGC 


ATTACCAGAG 


GAACATCGTT 


CTGAAGGTGT 


9420 




ATCATTGTAC 


TCGCTATTTT 


CAACGATTCC 


AAACTTAATC 


GGACCATTAG 


TTGCCGTAGG 


9480 




TATTTGGAAT 


GCAAATAATA 


TTTCACTATT 


TGCAATTGTC 


ATTATCTTTA 


TCGCATTAAC 


9540 




AACAACATTC 


TTTGsTATCG 


CGTGACCTTT 


GCTGAACAGG 


AACCCGATAC 


GTCAGATAAG 


9600 




ATTGAAAAAA 


TGCCGTTTAA 


CGCTGTAACT 


GTTTTTGCGC 


AATTTTTCAA 


AAATAAAGAG 


9660 




TTGTTGAACA 


GTGGTATTAT 


CATGATTGTT 


GCATCGATTG 


TATTTGGTGC 


AGTTAGTACA 


9720 




TTTGTACCGT 


TATACACAGT 


GAGTTTAGGA 


TTCGCGAATG 


CGGGAATCTT 


TTTGACAATA 


9780 




CAGGCCATCG 


CAGTTGTTGC 


GGCAAGATTT 


TACTTAAGGA 


AATACATTCC 


GTCAGATGGT 


9840 




ATGTGGCATC 


CTAAATATAT 


GGTATCTGTA 


CTATCATTAT 


TAGTAATCGC 


GTCATTTGTA 


9900 




GTGGCATTTG 


GTCCGCAAGT 


AGGTGCAATT 


ATTTTCTATG 


GTAGTGCGAT 


ATTAATAGGA 


9960 


ATGACGCAAG 


CAATGGTGTA 


CCCAACATTA 


ACATCATACT 


TAAGCTTCGT 


CTTACCAAAA 


10020 




GTAGGTCGTA 


ATATGTTGTT 


AGGTTTATTT 


ATTGCCTGTG 


CAGACTTAGG 


TATATCGTTA 


10080 




GGTGGCGCAT 


TGATGGGACC 


TATTTCCGAT 


TTAGTAGGAT 


TTAAATGGAT 


GTATCTAATT 


10140 




TGTGGTATGT 


TAGTCATTGT 


AATAATGATT 


ATGAGTTTCT 


TGAAAAAGCC 


AACACCACGT 


10200 



CCAGCGAGTA GTCTTTAATG AAGTGAATTA AAGCATATTA AGTTAATGAA TATTTAAATT 10260
TTAAAAGGTA TATTGaGCAT GGCGATTCAT GTGCTTCATG CTAGGACATG AAACATTCTA 10320 
TATGGCTCGT TTTTAGAACG ACAtATATCT AAATAAAGCA CGCTTArAAG TGAGTTTTGA 10380 
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SO 



TTACATGAAA 
ATTACGGTAT 
GAATTCCCAC 
TACTATTGTA 
AATAAGTTGA 
GATGGAaCCA 
TAGTGCGAAA 
TATAAGAATA 
AAACGCATCT 
GCTAATGTTC 
TACTAACCCA 
AAGGAAAATA 
TTCTATGTAA 
GATACTTTAG 
AACATATCAT 
TGTATCATAA 
TCATGCAATT 
GGCTTATTTG 
CTTTACTAGC 
TATTTTCTGG
TGGCfiGGAAA 
GCTTTTACAT 
CAACATTGTT 
AATATCGTGT 
TAGGTGAACA 
GTTATGGCGC 
GCACTTGGAT 
ATATCATTTT 
GTGAAGGTTT 



ATATGCAAAA 
GATTTTAAAT 
CATGTATTAA 
ATCTTTAAAG 
CAAATCAAAG 
TACCCTTTAA 
ATATTGAAAT 
TGCCATTTAT 
TCATGCATCA
ATTGAAATAG 
AAAGTTAAAA 
AATCCAATAT 
TGTTTATTTG 
CACATATTAC 
AAAATTGTTT 
ATATTGAATT 
TATAGTTAAC 
GTTAATTAAA 
ACGTATTAGA 
GGATAATGAA 
ATATAACTCT 
ACAGAACACA 
ATCAACATTC 
GCCGACAAAG 
TTTAAAACAT 
TTTAGGAAAA 
GAATACAGGT 
CCAAATTGGT 
ATTTAAAGAG 



CGAGTATAAC 

ATAAGTAAGT 
TGTATGGATA 
GTATTAATCT 
TATTTAATAT 
TGAGCGGGTA 
GATTTAAAAG 
ATTTAGCACT 
GACGAAAAAT 
GAAAAGAGAA 
AGACGATAAT 
CACGTTTGAA 
TATTTGACAT 
TTTGTATTGT 
TATAAAATGA 
GAAATTTTGG 
ATTATCGTTG 
GATAAAAGAC 
TATATTTCAG 
GGGAAACCTT 
CGTATGACCA 
ATGTTTCCGA 
ATTTATAAAA 
ATTGATCCGT 
CCATTTATTT 
AATGCCATTA
GAAGGTGGCT 
CCCGGTTTAT 
GTTGCACAGT



TGCTAATTGA 
CGCACTACCT 
GTAGAACAGA 
GCTTAATTCT 
AATGGTTAAC 
AATGTCAAAG 
TAAAAAGAGT 
AGCAACGATT 
AGCTAGTGAA 
ACCCCACGGA 
GATCGGCAAG 
AAAACGCGAT 
AGTATACCTC 
ATGTTTTATA
AGCGCTTCCA 
GGGGAGGTAT 
TAGGATTCAT 
AATCACAACA 
AAAAAATGGG 
TTTCACGTAA 
GCTTCGGTAC 
TGCAACGTAA 
TCGCGAATGA 
ATTACTTAAG 
TAAAACGTAT 
CAGCTTTATC 
TATCAGAATA 
TTGGTGTTCG 
TATCTAACGT 



TAGAAATAGC 
GCTAGTATCA 
GTTTCAAGGA 
TGAATTAAAA 
GAAAATATAG 
ACAGTAAAGG 
ACGACACTTA 
TGCGAACGTA 
ATAATAACTG 
GCTTGTTGAG 
ATGTTAACCA 
TGTTCGGTAG 
TTAAATAGTT 
CATTAAAATT 
TTGTGTTTTG 
TGTAATGACG 
GCTTACGGTT 
TAGTGTATTA 
ACCGGAATTA 
TGATTATAAA 
TACTAAAGAT 
TGAGATTTCA 
GCGTTTATTT 
TGATGACCAT 
CGTAGGACAA 
TAAAGGTCTA 
TCATTTAAAA 
TGATAAAGAA 
ACGCGCATTT 



TCACCATAAA 
ATGCTGGAAT 
TAATGGACAA 
TATGACGGAA 
CTATTAAACT 
AATCTACATT 
GTGTAAATGA
TCATTGGAAT 
CGAGTAAATA 
TGAATACAGC 
AAAATATGTA 
CGTATTCTTC 
GTATTATATA 
TAAAATGAAA 
TTTTGTAAGG 
TTTCTTACAG 
ATTGTTATCG 
AGGAATTATC 
CGTCAGTATT 
AATATCGTTT 
TATCAAGACG 
GTAGATAATA 
AGTCGTGAAG 
GCAATAAAAT 
TCTGGTATGA 
GCTAAAGCGG 
GGTAATGGGG 
GGTAATTTTA 
GAGCTGAAGT 
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TTGCTAAAAT CCGAAATGTT GAACCTTATA AAACAATCAA TTCACCTAAC CGTTACGAAT 12300 

TTATTCATAA TGCTGAAGAT TTGATTCGTT TCGTCGATCA GTTGCAGCAA TTAGGTCAAA 12360 

AACCAGTAGG ATTCAAAATT GTAGTAAGCA AAGTTTCAGA AATTGAAACA CTTGTACGTA 12420 

CGATGGTGGA ACTAGATAAG TATCCAAGCT TTATTACGAT TGATGGTGGT GAAGGTGGTA 12480 

CTGGTGCAAC ATTCCAAGAA TTACAAGATG GTGTTGGCTT ACCGCTATTT ACAGCTCTAC 12540

CTATTGTGTC TGGCATGTTA GAAAAATATG GTATTCGAGA TAAAGTGAAA TTGGCGGCAT 12600

CTGGTAAGTT AGTGACACCA GATAAAATTG CGATTGCACT AGGTTTAGGT GCAGATTTTG 12660

TAAATATCGC ACGTGGGATG ATGATTAGTG TCGGTTGTAT AATGAGTCAA CAATGTCACA 12720 

TGAATACGTG TCCTGTAGGT GTTGCAACGA CAGATGCGAA GAAAGAAAAA GCATTGATTG 12780

TTGGAGAAAA GCAATATCGT GTCACAAACT ATGTAACAAG TTTGCATGAA GGCTTATTCA 12840

ATATTGCAGC AGCTGTTGGC GTATCCAGTC CTACAGAAAT TACTGCTGAT CATATTGTAT 12900

ATCGAAAAGT CGATGGTGAG TTACAAACGA TACATGATTA TAAATTAAAA CTCATTAGTT 12960

AACTTAATTA TTTCGGGAAA TTGAAAGCAG CGGATTTTAG CGTTACTGCA AATAATTTTA 13020

TATTAGTAGT GGATGCTGGT CACACAAGAA CTTCAAATAT TAAAGCCCTC AGAATATGAA 13080

TTAAGGTTTG TAACCTTAGT CTTATCTGAG GGCATTTTTA AGTTATAAAC TATTTGTCGT 13140

CCATTTTATC TTTTTCTTTT AAACCTCTGT GCTTTAATTG CTTTTCAAGT TTTTCAAAAC 13200

TAATATCTTT ATTTTCTTTA GTCGAAACAC CAAGACGTTT ATTTAATTTT TTCATGTCAA 13260 

CTTCTGTGTA ATCTATGTCT AAGTGyTCAA TTGCTTTTTT ATCTTTATAG TCTACTTTGT 13320

ATTTTACGCC TTTAAGGTCT TTGAAAATAC TTTCAGATTT GGCGAATAAC TTTTTGGCTT 13380

CGTCTTTATC CATACCTAGA TCGTCATATT TAATTGTGTT GATTGTAGAcT TGTTTTAAAA 13440 

CTTEaTCATC TTTATATGTG ATAGAAGTTA GTACATGTTT ACCACTAACA TCACCwTCAT 13500

ATGTTTTGGT TTGTTCTTTA CCACAAGCTG ATAATGCAAT GATACAAACT AATGCTACTA 13560

CAATTAATGA ACATAATTTT TTCAAAGTCA GTCGCCTTCT TTCGATATTT GTATTATAAA 13620

GAAATTATAA CATTTACTAA AAAATGATGT TATTCAAAAA TTTAAATTTT GTCATTTTTT 13680

TTGAAGATAT GAGTTTTTTT AAGCGGATTC CTCACAAAAT TTTAAAAATA TTTAAGCCTk 13740 

AAAATGATAA AGCGkTAGGG AACGTTTTTC TGAAAGTTAG TGATACAATA GTTTTAAGTT 13800 

GAAATACAGG AGGATGAATA ACATGAATCA GTCAGTCAAA TTACTTAAAC ATTTAACAGA 13860

TGTAAACGGC ATTGCTGGTT ATGAAATGCA AGTTAAAGAA GCAATGCGTa ACTATATAGA 13920

GCCTGTCAGT GATCAAATTA TTGAAGATAA CTTGGGTGGC ATTTTTGGAA AGAAAAATGC 13980 
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TAAAGATGTA 
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ATAACATCGA 
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